Environmental Impact

Every prompt, tracked & measured

Quantified sustainability gains from prompt optimization, smart routing, and carbon-aware inference. Real numbers. Real savings.

Stats update in real time as the framework is used.

⚡

Avg Energy Saved

↑ vs always-large baseline

✂️

Token Reduction

↑ from prompt optimizer

🌿

CO₂ Avoided (demo)

↓ vs baseline inference

🔢

Tokens Saved (demo)

↑ across test queries

Model Routing Distribution

% of queries routed to each tier (balanced mode)

SmolLM2 (S): 62% of queries, ~0.9 mWh per query

Phi-3 Mini (M): 28% of queries, ~3.8 mWh per query

Llama 3 70B (L): 10% of queries, ~48 mWh per query

Avg energy per query: 4.2 mWh

Baseline (always large): ~48 mWh
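As a rough check on these figures, the sketch below computes the share-weighted mean of the per-tier energies and the reduction relative to the always-large baseline. The function names are illustrative and not part of GreenInfer. Note that the weighted mean of the listed tier figures comes to about 6.4 mWh, which is higher than the displayed 4.2 mWh average; the page does not say how the displayed average is derived, so it may come from live usage data rather than from these rounded per-tier values.

```python
# Figures as displayed on the dashboard:
# (share of queries, approximate energy per query in mWh).
TIERS = {
    "SmolLM2 (S)": (0.62, 0.9),
    "Phi-3 Mini (M)": (0.28, 3.8),
    "Llama 3 70B (L)": (0.10, 48.0),
}
BASELINE_MWH = 48.0      # always-large baseline, as displayed
DISPLAYED_AVG_MWH = 4.2  # average per query, as displayed


def weighted_avg_mwh(tiers):
    """Share-weighted mean energy per query across tiers (illustrative helper)."""
    return sum(share * energy for share, energy in tiers.values())


def reduction_vs_baseline(avg_mwh, baseline_mwh=BASELINE_MWH):
    """Fractional energy reduction relative to the baseline."""
    return 1.0 - avg_mwh / baseline_mwh


print(f"weighted mean of listed tiers: {weighted_avg_mwh(TIERS):.2f} mWh")
print(f"reduction at displayed average: {reduction_vs_baseline(DISPLAYED_AVG_MWH):.1%}")
```

With the displayed 4.2 mWh average, the reduction against the 48 mWh baseline works out to roughly 91%.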

Where the savings come from

Cumulative energy reduction by framework component

✂️ Prompt Optimization: −38% tokens → −38% energy

⚡ Model Tier Routing (avoiding the large tier): −91% energy per routed query

🌿 Carbon-Aware Scheduling: −12% CO₂ from grid timing

🔄 RL Policy Improvement: ~8% additional reduction over time

Combined avg reduction: ~73%

Pareto Frontier - Accuracy vs Energy

Optimal routing curve across model tiers

Points on the curve = optimal routing choices. GreenInfer selects the lowest-energy point that meets accuracy requirements.
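The selection rule described above can be sketched as follows. This is a minimal illustration, not GreenInfer's actual implementation: the `Point` type, the function names, the accuracy scores, and the extra "Hypothetical (X)" tier are all assumptions made up for the example. Only the three energy figures come from the routing distribution on this page.

```python
from typing import NamedTuple, Optional


class Point(NamedTuple):
    name: str
    energy_mwh: float
    accuracy: float  # hypothetical score in [0, 1], not a measured value


def pareto_frontier(points):
    """Keep points that no other point beats on both energy and accuracy."""
    frontier = []
    best_acc = float("-inf")
    # Ascending energy; for equal energy, the more accurate point comes first.
    for p in sorted(points, key=lambda p: (p.energy_mwh, -p.accuracy)):
        if p.accuracy > best_acc:
            frontier.append(p)
            best_acc = p.accuracy
    return frontier


def select(points, min_accuracy) -> Optional[Point]:
    """Lowest-energy frontier point meeting the accuracy requirement, else None."""
    for p in pareto_frontier(points):  # frontier is in ascending energy order
        if p.accuracy >= min_accuracy:
            return p
    return None


# Energy values are from the page; accuracy values are illustrative only.
POINTS = [
    Point("SmolLM2 (S)", 0.9, 0.70),
    Point("Phi-3 Mini (M)", 3.8, 0.82),
    Point("Hypothetical (X)", 10.0, 0.80),  # dominated by Phi-3 Mini
    Point("Llama 3 70B (L)", 48.0, 0.93),
]

choice = select(POINTS, min_accuracy=0.80)
print(choice.name if choice else "no tier meets the requirement")
```

Returning `None` when no point qualifies leaves the fallback policy (for example, using the largest tier anyway) to the caller.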

CO₂ Emissions - GreenInfer vs Baseline

Simulated across 20 benchmark queries

Chart legend: GreenInfer vs. Always-Large Baseline

GreenInfer total: 0.84g CO₂

Baseline total: 3.12g CO₂
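The two totals above imply both an absolute saving and a relative reduction. A short sketch of that arithmetic, with illustrative function names:

```python
GREENINFER_G = 0.84  # simulated GreenInfer total, grams of CO2 (from the page)
BASELINE_G = 3.12    # simulated always-large total, grams of CO2 (from the page)


def co2_saved_g(baseline_g, actual_g):
    """Absolute CO2 avoided, in grams."""
    return baseline_g - actual_g


def co2_reduction(baseline_g, actual_g):
    """Fractional CO2 reduction relative to the baseline."""
    return co2_saved_g(baseline_g, actual_g) / baseline_g


print(f"saved: {co2_saved_g(BASELINE_G, GREENINFER_G):.2f} g")
print(f"reduction: {co2_reduction(BASELINE_G, GREENINFER_G):.0%}")
```

This gives 2.28 g saved and a reduction of about 73%, in line with the combined average reduction quoted on this page.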

Grid Awareness

ERCOT Carbon Intensity

Current intensity (live): 198 gCO₂/kWh

Scale: 0 (Clean), 400 (Moderate), 800+ (Heavy)

Grid favorable → Balanced routing enabled

Above 400 gCO₂/kWh, eco mode is forced. Below 150 gCO₂/kWh, performance mode is unlocked.
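The threshold rule can be sketched as a small function. This is an assumption-laden illustration: the function name and the mode strings are made up, and treating every intensity between the two thresholds as balanced mode is inferred from the 198 gCO₂/kWh reading shown with balanced routing enabled.

```python
ECO_THRESHOLD = 400.0          # gCO2/kWh; above this, eco mode is forced
PERFORMANCE_THRESHOLD = 150.0  # gCO2/kWh; below this, performance is unlocked


def routing_mode(intensity_g_per_kwh: float) -> str:
    """Map grid carbon intensity to a routing mode (hypothetical mode names)."""
    if intensity_g_per_kwh < 0:
        raise ValueError("carbon intensity cannot be negative")
    if intensity_g_per_kwh > ECO_THRESHOLD:
        return "eco"
    if intensity_g_per_kwh < PERFORMANCE_THRESHOLD:
        return "performance"
    # Assumption: anything in between uses balanced routing.
    return "balanced"


print(routing_mode(198))
```

Both comparisons are strict, matching the ">400" and "<150" wording, so readings of exactly 400 or 150 fall into balanced mode in this sketch.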

ROUTING DECISIONS THIS SESSION

Grid-triggered eco shifts: 3

Queries budget-capped: 0

Over-routing prevented by complexity check: 7

Total CO₂ saved (session): 2.28 g
