Every prompt tracked & measured

Quantified sustainability gains from prompt optimization, smart routing, and carbon-aware inference. Real numbers. Real savings.

Stats update in real time as the framework is used.
Avg Energy Saved: 0% (↑ vs always-large baseline)
✂️ Token Reduction: 0% (↑ from prompt optimizer)
🌿 CO₂ Avoided (demo): 0 g (↓ vs baseline inference)
🔢 Tokens Saved (demo): 0 (↑ across test queries)
Model Routing Distribution
% of queries routed to each tier (balanced mode)
SmolLM2 (S): 62%, ~0.9 mWh
Phi-3 Mini (M): 28%, ~3.8 mWh
Llama 3 70B (L): 10%, ~48 mWh
Avg energy per query: 4.2 mWh
Baseline (always large): ~48 mWh
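
As a rough check on the routing figures above, the sketch below computes the share-weighted mean energy per query from the listed percentages and per-tier values, along with the reduction implied by the stated 4.2 mWh average against the ~48 mWh baseline. The names `TIERS`, `weighted_mean_energy`, and `reduction_vs_baseline` are illustrative and not part of GreenInfer. The plain weighted mean of the listed figures comes out near 6.4 mWh, higher than the 4.2 mWh shown, so the displayed average may reflect per-query measurements or other savings that this panel does not break out.

```python
# Routing shares and approximate per-query energy (mWh), copied from the panel.
TIERS = {
    "SmolLM2 (S)": (0.62, 0.9),
    "Phi-3 Mini (M)": (0.28, 3.8),
    "Llama 3 70B (L)": (0.10, 48.0),
}


def weighted_mean_energy(tiers):
    """Share-weighted mean energy per query, in mWh."""
    return sum(share * energy for share, energy in tiers.values())


def reduction_vs_baseline(avg_mwh, baseline_mwh):
    """Fractional energy reduction relative to the baseline."""
    return 1.0 - avg_mwh / baseline_mwh


mean_mwh = weighted_mean_energy(TIERS)
# 4.2 mWh and 48 mWh are the figures stated on the page.
stated_reduction = reduction_vs_baseline(4.2, 48.0)

print(f"weighted mean of listed tiers: {mean_mwh:.2f} mWh")
print(f"reduction at 4.2 mWh vs 48 mWh baseline: {stated_reduction:.2%}")
```

The second figure, roughly 91%, lines up with the per-routed-query reduction quoted for model tier routing.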
Where the savings come from
Cumulative energy reduction by framework component
✂️ Prompt Optimization: −38% tokens → −38% energy
⚡ Model Tier Routing (avoids S→L escalation): −91% per routed query
🌿 Carbon-Aware Scheduling: −12% CO₂ from grid timing
🔄 RL Policy Improvement: ~8% additional over time
Combined avg reduction: ~73%
Pareto Frontier - Accuracy vs Energy
Optimal routing curve across model tiers
Points on the curve = optimal routing choices. GreenInfer selects the lowest-energy point that meets accuracy requirements.
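
The selection rule described above (take the lowest-energy point that still meets the accuracy requirement) can be sketched as follows. This is a minimal illustration, not GreenInfer's implementation: the `Tier` class, `pareto_frontier`, and `select_tier` are hypothetical names, and the accuracy values are placeholders invented for the example. Only the energy figures come from the routing panel.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tier:
    name: str
    energy_mwh: float  # per-query energy from the routing panel
    accuracy: float    # HYPOTHETICAL placeholder, not a measured score


TIERS = [
    Tier("SmolLM2 (S)", 0.9, 0.70),
    Tier("Phi-3 Mini (M)", 3.8, 0.82),
    Tier("Llama 3 70B (L)", 48.0, 0.93),
]


def pareto_frontier(tiers: List[Tier]) -> List[Tier]:
    """Keep tiers that no other tier beats on both energy and accuracy."""
    frontier = []
    for t in tiers:
        dominated = any(
            o.energy_mwh <= t.energy_mwh
            and o.accuracy >= t.accuracy
            and (o.energy_mwh < t.energy_mwh or o.accuracy > t.accuracy)
            for o in tiers
        )
        if not dominated:
            frontier.append(t)
    # Sort by energy so the first qualifying point is the cheapest one.
    return sorted(frontier, key=lambda t: t.energy_mwh)


def select_tier(tiers: List[Tier], min_accuracy: float) -> Optional[Tier]:
    """Lowest-energy frontier point meeting the requirement, or None."""
    for t in pareto_frontier(tiers):
        if t.accuracy >= min_accuracy:
            return t
    return None


choice = select_tier(TIERS, min_accuracy=0.80)
print(choice.name if choice else "no tier meets the requirement")
```

With these placeholder accuracies, a 0.80 requirement skips the small tier and lands on the medium one, because the large tier costs more energy for accuracy the query does not need.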
CO₂ Emissions - GreenInfer vs Baseline
Simulated across 20 benchmark queries
GreenInfer
Always-Large Baseline
GreenInfer total: 0.84g CO₂
Baseline total: 3.12g CO₂
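
The two totals above are enough to work out the avoided emissions and the relative reduction. The short sketch below does that arithmetic. The variable names are illustrative.

```python
greeninfer_g = 0.84  # GreenInfer total, g CO2 (from the chart)
baseline_g = 3.12    # always-large baseline total, g CO2 (from the chart)

avoided_g = baseline_g - greeninfer_g
reduction = avoided_g / baseline_g

print(f"avoided: {avoided_g:.2f} g CO2 ({reduction:.0%} reduction)")
```

This gives 2.28 g avoided, a reduction of about 73%, which agrees with the combined average reduction and the session CO₂ total shown elsewhere on the page.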

ERCOT Carbon Intensity

Current intensity (live): 198 gCO₂/kWh
Scale: 0 (Clean), 400 (Moderate), 800+ (Heavy)
Grid favorable → Balanced routing enabled
Above 400 gCO₂/kWh, eco mode is forced. Below 150 gCO₂/kWh, performance mode is unlocked.
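
The thresholds above map naturally onto a small decision function. The following is a sketch under stated assumptions, not GreenInfer's actual routing code: the function name `routing_mode` and the constants are hypothetical, the middle band is taken to be balanced mode (consistent with the 198 gCO₂/kWh reading shown), and readings of exactly 150 or 400 are assumed to fall in the balanced band.

```python
ECO_THRESHOLD = 400.0          # gCO2/kWh; above this, eco mode is forced
PERFORMANCE_THRESHOLD = 150.0  # gCO2/kWh; below this, performance is unlocked


def routing_mode(intensity_g_per_kwh: float) -> str:
    """Map grid carbon intensity to a routing mode.

    Boundary handling (exactly 150 or 400 counts as balanced) is an
    assumption; the page only states the strict inequalities.
    """
    if intensity_g_per_kwh > ECO_THRESHOLD:
        return "eco"
    if intensity_g_per_kwh < PERFORMANCE_THRESHOLD:
        return "performance"
    return "balanced"


print(routing_mode(198))  # the live reading shown on the page
```

At the 198 gCO₂/kWh reading the function returns balanced mode, matching the status line in the panel.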
ROUTING DECISIONS THIS SESSION
Grid-triggered eco shifts: 3
Queries budget-capped: 0
Over-routing prevented by query complexity: 7
Total CO₂ saved (session): 2.28 g