About
GreenInfer

A research-backed, open-source Green Orchestration Framework for sustainable AI inference - built as an ISM Final Product, proving that AI doesn't have to cost the planet.

ISM Final Product Frisco ISD Green AI Research

AI's hidden energy cost

Training and running large language models consumes enormous amounts of electricity. A single ChatGPT query uses roughly 10× the energy of a Google search - and the gap is growing as models get larger and usage explodes.

The core inefficiency: every query - from "hi" to "write me a PhD thesis" - gets routed to the same massive model. There's no intelligence in the routing. No awareness of energy cost. No optimization.

GreenInfer addresses this at the infrastructure level, before a single GPU cycle is burned.

70%+ savings available today

Research shows that 60–70% of real-world AI queries are simple enough to be handled by small, efficient models - if the routing decision is made correctly.

Combining smart routing with prompt optimization (reducing wasted tokens before inference) and carbon-aware scheduling creates a compounding effect: savings from every component stack.
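As a back-of-the-envelope illustration of how layered savings stack, the sketch below multiplies out independent per-layer reductions. The percentages are hypothetical placeholders, not measured GreenInfer results:

```python
# Illustrative sketch: how savings from independent layers compound.
# All per-layer percentages are hypothetical, not measured results.

def combined_savings(layer_savings):
    """Each layer keeps (1 - s) of the remaining energy cost."""
    remaining = 1.0
    for s in layer_savings:
        remaining *= (1.0 - s)
    return 1.0 - remaining

# Hypothetical per-layer reductions:
layers = {
    "smart routing": 0.50,            # small models handle simple queries
    "prompt compression": 0.20,       # fewer input tokens per request
    "carbon-aware scheduling": 0.15,  # run when the grid is cleaner
}

total = combined_savings(layers.values())
print(f"Combined reduction: {total:.0%}")  # 1 - 0.5 * 0.8 * 0.85 -> 66%
```

Because each layer acts on what the previous layers left behind, three modest reductions combine into a much larger total than any single one alone.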

This isn't a future solution. The models, tools, and APIs exist today. GreenInfer integrates them into a unified framework anyone can use.

Build roadmap

Component · Description · Status
GreenPromptsOptimizer · T5-based prompt compression model, fine-tuned, hosted on Hugging Face · ✓ COMPLETE
Website (this site) · Full multi-page site: landing, chat UI, framework docs, impact dashboard, about · ✓ COMPLETE
Complexity Scorer · Classifier model for prompt complexity 0–100 using linguistic features · ⟳ IN PROGRESS
Orchestration Engine · Core routing logic: complexity + energy + mode → model tier selection · ⟳ IN PROGRESS
Energy Estimation Module · Token-based proxy + CodeCarbon GPU benchmarking · ⟳ IN PROGRESS
Carbon-Aware Routing · ERCOT real-time grid carbon intensity API integration · → NEXT
Reinforcement Learning Router · Policy-gradient agent trained on historical accuracy/energy data · → PLANNED
Live Backend Integration · Connect chat UI to real model API calls via the orchestration layer · → PLANNED
Open Source Release · Framework published to GitHub with documentation for developer use · → FINAL
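To make the "complexity + energy + mode → model tier" idea concrete, here is a minimal sketch of what tier selection could look like. The thresholds, tier names, mode offsets, and the 450 g/kWh grid-intensity cutoff are all illustrative assumptions, not GreenInfer's actual routing policy:

```python
# Hypothetical sketch of orchestration-engine tier selection.
# Thresholds, tier names, and the "mode" knob are illustrative
# assumptions, not the framework's actual policy.

TIERS = ["small", "medium", "large"]  # cheapest to most capable

def select_tier(complexity: int,
                grid_intensity_g_per_kwh: float,
                mode: str = "balanced") -> str:
    """Map a 0-100 complexity score to a model tier, nudged by
    grid carbon intensity and a user-selected mode."""
    # Mode shifts the complexity thresholds up (eco) or down (quality).
    offsets = {"eco": 15, "balanced": 0, "quality": -15}
    shift = offsets.get(mode, 0)
    # When the grid is dirty, lean toward smaller models.
    if grid_intensity_g_per_kwh > 450:
        shift += 10
    if complexity < 35 + shift:
        return "small"
    if complexity < 70 + shift:
        return "medium"
    return "large"

print(select_tier(20, 300))          # simple query, clean grid -> "small"
print(select_tier(80, 500, "eco"))   # hard query, dirty grid   -> "medium"
```

The key design point is that the same prompt can legitimately land on different tiers depending on the grid and the user's preference, which is what makes the routing carbon-aware rather than purely accuracy-driven.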

Researcher & Mentor

🧑‍💻
Srinesh Toranala
Student Researcher · ISM Program

Frisco ISD student building GreenInfer as an Independent Study & Mentorship final project. The project synthesizes a year of research in AI efficiency, energy systems, and sustainable computing. Srinesh previously built GreenPromptsOptimizer - a T5-based prompt compression model that now forms the first layer of the GreenInfer pipeline. This project represents a full-stack effort: research, model training, framework design, and deployment - built with the goal of making green AI accessible to every developer.

Python PyTorch Transformers Hugging Face Green AI
🎓
Marta Adamska
PhD Candidate · University of Lancaster · Mentor

Marta Adamska is a PhD candidate at the University of Lancaster whose research sits at the intersection of AI systems, sustainability, and computational efficiency. As the project mentor, she helped refine the scope of GreenInfer - prioritizing the orchestration framework as the core deliverable and the chatbot as its showcase. Her guidance emphasized that the project's feasibility is grounded in the foundational work of the prompt optimization model, and helped shape the research direction toward a rigorous, achievable timeline.

Skills demonstrated

🧠
AI Research
Academic literature review on energy-efficient inference, model quantization, and sustainable AI deployment
💻
ML Engineering
Fine-tuning T5 models, building complexity classifiers, and implementing RL routing policies in PyTorch
⚙️
Systems Design
Multi-layer pipeline architecture balancing accuracy, latency, and real-time carbon intensity constraints
🌐
Full-Stack Dev
End-to-end web application: HTML/CSS/JS frontend, Python backend, Hugging Face API integration
📊
Data Analysis
Empirical benchmarking with CodeCarbon, Pareto frontier analysis, and statistical energy modeling
🌿
Sustainability
Applying carbon-aware computing principles and grid intensity awareness to real-world AI infrastructure
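The Pareto frontier analysis mentioned under Data Analysis can be sketched briefly: given candidate models scored on accuracy (higher is better) and energy per query (lower is better), keep only the options no other model beats on both axes. The model names and numbers below are illustrative, not benchmark results:

```python
# Hypothetical Pareto frontier over (accuracy, energy-per-query) points.
# Model names and numbers are illustrative, not measured benchmarks.

models = {
    "tiny":   (0.72, 0.3),  # (accuracy, Wh per query)
    "small":  (0.81, 1.1),
    "medium": (0.84, 4.0),
    "large":  (0.83, 9.5),  # dominated: less accurate AND costlier than "medium"
}

def pareto_frontier(candidates):
    """Keep models not dominated on both accuracy and energy."""
    frontier = {}
    for name, (acc, wh) in candidates.items():
        dominated = any(
            a >= acc and w <= wh and (a, w) != (acc, wh)
            for a, w in candidates.values()
        )
        if not dominated:
            frontier[name] = (acc, wh)
    return frontier

print(sorted(pareto_frontier(models)))  # ['medium', 'small', 'tiny']
```

Models that fall off the frontier (like the hypothetical "large" above) can be excluded from routing entirely, since some other tier answers at least as well for less energy.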

Key references

Energy and Policy Considerations for Deep Learning in NLP
Strubell et al. (2019) · ACL · Foundation for AI carbon cost quantification
Efficiently Scaling Transformer Inference
Pope et al. (2022) · Google Research · Model efficiency and routing strategies
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
Chen et al. (2023) · Stanford · LLM cascade and routing inspiration
CodeCarbon: Estimating the Carbon Footprint of Computation
Lacoste et al. (2019) · NeurIPS Workshop · Energy benchmarking library used in framework
Mixture of Experts: A Survey
Cai et al. (2024) · arXiv · Sparse routing and model specialization
Carbon-Aware Computing for Datacenters
Wiesner et al. (2021) · IEEE · Grid-aware workload scheduling foundation

Open source. Open science.

GreenInfer is built to be shared. The framework code will be published on GitHub for any developer to integrate into their own AI applications.

Try the Chat Demo Read the Docs