Independent Study & Mentorship

About
GreenInfer

A research-backed, open-source Green Orchestration Framework for sustainable AI inference. Built as an ISM Final Product to show that AI efficiency and environmental responsibility can go hand in hand.

ISM Final Product  ·  Frisco ISD  ·  Green AI Research
The Problem

AI's hidden energy cost

Training and running large language models consumes enormous amounts of electricity. A single ChatGPT query uses roughly 10x the energy of a Google search, and the gap keeps growing as models get larger and usage scales globally.

The core inefficiency is that every query — whether "hi" or "design a distributed system" — gets routed to the same massive model. There is no intelligence in the routing, no awareness of energy cost, and no attempt at optimization.

GreenInfer tackles this at the infrastructure level, before a single GPU cycle is burned on inference.
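A rule-based complexity scorer of the kind described in the roadmap below can be sketched in a few lines. The weights, normalizers, and function names here are illustrative assumptions, not GreenInfer's actual implementation:

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy of the prompt, in bits per character."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def complexity_score(prompt: str) -> float:
    """Blend token length and entropy into a rough 0-1 complexity signal.
    The 50-token and 5-bit normalizers are arbitrary illustrative choices."""
    tokens = prompt.split()  # crude whitespace tokenization for the sketch
    length_signal = min(len(tokens) / 50.0, 1.0)
    entropy_signal = min(shannon_entropy(prompt) / 5.0, 1.0)
    return 0.5 * length_signal + 0.5 * entropy_signal

print(complexity_score("hi"))
print(complexity_score("design a distributed system for real-time carbon-aware inference"))
```

A trivial greeting scores far lower than a systems-design request, which is exactly the signal a router needs to avoid burning a large model on a small question.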

The Opportunity

70%+ savings available today

Research shows that 60-70% of real-world AI queries are simple enough to be handled by small, efficient models, provided the routing decision is made correctly. Papers such as FrugalGPT back this up empirically.

Combining smart routing with prompt optimization and carbon-aware scheduling creates a compounding effect where savings from each layer stack together.
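The compounding effect is multiplicative, not additive. With rough, purely illustrative numbers (the per-layer savings below are assumptions for the sake of arithmetic, not measured results):

```python
# Illustrative only: these per-layer savings are assumptions, not GreenInfer measurements.
prompt_compression = 0.20   # e.g. 20% fewer input tokens after compression
small_model_routing = 0.60  # e.g. 60% less energy when a small model suffices
carbon_scheduling = 0.15    # e.g. 15% lower emissions by running in cleaner grid hours

# Each layer acts on what the previous layers left behind, so the
# remaining energy is the product of the remaining fractions.
remaining = (1 - prompt_compression) * (1 - small_model_routing) * (1 - carbon_scheduling)
print(f"Combined savings: {1 - remaining:.0%}")  # → Combined savings: 73%
```

Even with modest numbers per layer, the stacked result lands in the 70%+ range the research suggests is available.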

The models, tools, and APIs exist today. GreenInfer integrates them into a unified, developer-friendly framework.

Project Status

Build roadmap

Component  ·  Description  ·  Status
GreenPromptsOptimizer  ·  T5-based prompt compression, fine-tuned on 127+ prompt pairs, hosted on Hugging Face  ·  ✓ Complete
Complexity Scorer  ·  Rule-based linguistic scorer using entropy, token length, and task classification  ·  ✓ Complete
DistilBERT Classifier  ·  Fine-tuned on 600 labeled prompts, 98.9% accuracy on validation set  ·  ✓ Complete
Orchestration Engine  ·  Core routing logic combining complexity, energy, mode, and carbon budget  ·  ✓ Complete
Cascade Inference  ·  Small→Medium→Large escalation with confidence-based gating  ·  ✓ Complete
Energy Estimator  ·  Token-based proxy estimation with CodeCarbon integration  ·  ✓ Complete
Website  ·  Full multi-page site: landing, chat UI, framework docs, impact, about  ·  ✓ Complete
HF Space Backend  ·  FastAPI server with CORS, optimizer, Groq cascade, carbon metrics  ·  ✓ Live
Carbon-Aware Routing  ·  ERCOT real-time grid intensity for dynamic mode adjustment  ·  ⌛ In Progress
User Accounts  ·  Sign-up, login, and persistent chat history via Supabase  ·  ⌛ In Progress
RL Router  ·  Policy-gradient agent learning optimal routing from history  ·  → Planned
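The Small→Medium→Large escalation with confidence-based gating can be sketched roughly as follows. The tier behavior, threshold value, and stub models are hypothetical stand-ins for the production routing code:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tier:
    name: str
    # Each tier answers a prompt and reports a confidence in [0, 1].
    ask: Callable[[str], tuple[str, float]]

def cascade(prompt: str, tiers: list[Tier], threshold: float = 0.8) -> tuple[str, str]:
    """Try tiers smallest-first; escalate whenever confidence falls below threshold."""
    for tier in tiers[:-1]:
        answer, confidence = tier.ask(prompt)
        if confidence >= threshold:
            return answer, tier.name  # cheap tier was confident enough: stop here
    answer, _ = tiers[-1].ask(prompt)  # largest tier is the unconditional fallback
    return answer, tiers[-1].name

# Stub models standing in for real endpoints (hypothetical behavior: the small
# model is only confident on very short prompts).
small = Tier("small", lambda p: ("short answer", 0.9 if len(p.split()) < 5 else 0.3))
large = Tier("large", lambda p: ("detailed answer", 0.95))

print(cascade("hi", [small, large]))
print(cascade("design a distributed system for planet-scale inference", [small, large]))
```

The key property is that most traffic never reaches the large model: a simple query exits at the first confident tier, and only genuinely hard prompts pay the full energy cost.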
The Team

Researcher & Mentor

🧑‍💻
Srinesh Toranala
Student Researcher  ·  ISM Program, Frisco ISD

Frisco ISD student building GreenInfer as an ISM final project. GreenInfer brings together a year of research in AI efficiency, energy systems, and sustainable computing. Previously built GreenPromptsOptimizer, a T5-based prompt compression model that forms the first layer of the GreenInfer pipeline. The project spans model training, framework engineering, backend deployment, and public product launch — built to make green AI genuinely accessible to developers everywhere.

Python  ·  PyTorch  ·  Transformers  ·  FastAPI  ·  Hugging Face  ·  Green AI
🎓
Marta Adamska
PhD Candidate  ·  University of Lancaster  ·  Mentor

PhD Candidate at the University of Lancaster researching AI systems, sustainability, and computational efficiency. Ms. Adamska's expertise in sustainable computing has been instrumental to this project, providing the research direction, key papers, and guidance that shaped GreenInfer from an early idea into a working framework. Her mentorship and technical depth raised the rigor of this work throughout the research process.

ISM Program

Skills demonstrated

🧠
AI Research
Literature review spanning energy-efficient inference, cascading, and sustainable agent design
💻
ML Engineering
Fine-tuning T5 and DistilBERT, building classifiers, and production-grade inference pipelines
⚙️
Systems Design
Multi-layer pipeline balancing accuracy, latency, and real-time carbon intensity
🌐
Full-Stack Dev
HTML/CSS/JS frontend, FastAPI backend on HF Spaces, Groq API integration
📊
Data Analysis
Benchmarking, Pareto frontier analysis, energy comparison, statistical evaluation
🌿
Sustainability
Carbon-aware computing, ERCOT grid modeling, per-session CO₂ budget enforcement
Research Foundation

Key references

Papers recommended by Ms. Adamska and reviewed during research. These directly shaped the design decisions in GreenInfer.

Towards Greener LLMs
2024  ·  arXiv:2403.20306  ·  Core motivation for energy-aware LLM deployment
arxiv.org/pdf/2403.20306
FrugalGPT: Reducing LLM Cost and Improving Performance
Chen et al. (2023)  ·  Stanford  ·  Core inspiration for cascade routing architecture
Demonstrated cascading smaller models saves 40-90% of compute cost
Budget ML Agent
ACM  ·  Cost-aware agent design adapted for energy budgeting
dl.acm.org/doi/full/10.1145/3703412.3703416
EnergAgent: Energy-Aware Agent Framework
GI  ·  Direct inspiration for energy-aware routing decisions
dl.gi.de/items/4ee3b7d1-80a3-46c8-9eb6-26985eb607ab
How Hungry is AI?
2025  ·  arXiv:2505.09598  ·  Empirical energy consumption numbers for LLMs
arxiv.org/pdf/2505.09598
DynamoLLM: Dynamic LLM Serving
arXiv:2408.00742  ·  Energy-aware LLM serving; cascading model ideas
arxiv.org/abs/2408.00742
Energy and Policy Considerations for Deep Learning in NLP
Strubell et al. (2019)  ·  ACL  ·  Foundational paper quantifying carbon cost of model training
CodeCarbon: Estimating the Carbon Footprint of Computation
Lacoste et al. (2019)  ·  NeurIPS Workshop  ·  Energy benchmarking library integrated in framework
Get Involved

Open source. Open science.

GreenInfer is built to be shared. The framework is on GitHub for any developer to use, extend, or build on top of.

Try the Chat  ·  Read the Docs