About
GreenInfer

A research-backed, open-source Green Orchestration Framework for sustainable AI inference - built as an ISM Final Product, proving that AI doesn't have to cost the planet.

ISM Final Product Frisco ISD Green AI Research

AI's hidden energy cost

Training and running large language models consumes enormous amounts of electricity. A single ChatGPT query uses roughly 10× the energy of a Google search - and the gap is growing as models get larger and usage explodes.

The core inefficiency: every query - from "hi" to "write me a PhD thesis" - gets routed to the same massive model. There's no intelligence in the routing. No awareness of energy cost. No optimization.

GreenInfer addresses this at the infrastructure level, before a single GPU cycle is burned.

70%+ savings available today

Research shows that 60–70% of real-world AI queries are simple enough to be handled by small, efficient models - if the routing decision is made correctly.

Combining smart routing with prompt optimization (reducing wasted tokens before inference) and carbon-aware scheduling creates a compounding effect: savings from every component stack.
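As a back-of-the-envelope illustration of how layered savings stack, the sketch below multiplies out independent per-layer reductions. The percentages are hypothetical placeholders, not measured GreenInfer results:

```python
# Illustrative sketch: how savings from independent layers compound.
# All per-layer percentages are hypothetical, not measured results.

def combined_savings(layer_savings):
    """Each layer keeps (1 - s) of the remaining energy cost."""
    remaining = 1.0
    for s in layer_savings:
        remaining *= (1.0 - s)
    return 1.0 - remaining

# Hypothetical per-layer reductions:
layers = {
    "smart routing": 0.50,            # small models handle simple queries
    "prompt compression": 0.20,       # fewer input tokens per request
    "carbon-aware scheduling": 0.15,  # run when the grid is cleaner
}

total = combined_savings(layers.values())
print(f"Combined reduction: {total:.0%}")  # 1 - 0.5 * 0.8 * 0.85 -> 66%
```

Because each layer acts on what the previous layers left behind, three modest reductions combine into a much larger total than any single one alone.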

This isn't a future solution. The models, tools, and APIs exist today. GreenInfer integrates them into a unified framework anyone can use.

Build roadmap

Component · Description · Status
GreenPromptsOptimizer · T5-based prompt compression model, fine-tuned, hosted on Hugging Face · ✓ COMPLETE
Website (this site) · Full multi-page site: landing, chat UI, framework docs, impact dashboard, about · ✓ COMPLETE
Complexity Scorer · Classifier model for prompt complexity 0–100 using linguistic features · ⟳ IN PROGRESS
Orchestration Engine · Core routing logic: complexity + energy + mode → model tier selection · ⟳ IN PROGRESS
Energy Estimation Module · Token-based proxy + CodeCarbon GPU benchmarking · ⟳ IN PROGRESS
Carbon-Aware Routing · ERCOT real-time grid carbon intensity API integration · → NEXT
Reinforcement Learning Router · Policy-gradient agent trained on historical accuracy/energy data · → PLANNED
Live Backend Integration · Connect chat UI to real model API calls via the orchestration layer · → PLANNED
Open Source Release · Framework published to GitHub with documentation for developer use · → FINAL
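To make the "complexity + energy + mode → model tier" idea concrete, here is a minimal sketch of what tier selection could look like. The thresholds, tier names, mode offsets, and the 450 g/kWh grid-intensity cutoff are all illustrative assumptions, not GreenInfer's actual routing policy:

```python
# Hypothetical sketch of orchestration-engine tier selection.
# Thresholds, tier names, and the "mode" knob are illustrative
# assumptions, not the framework's actual policy.

TIERS = ["small", "medium", "large"]  # cheapest to most capable

def select_tier(complexity: int,
                grid_intensity_g_per_kwh: float,
                mode: str = "balanced") -> str:
    """Map a 0-100 complexity score to a model tier, nudged by
    grid carbon intensity and a user-selected mode."""
    # Mode shifts the complexity thresholds up (eco) or down (quality).
    offsets = {"eco": 15, "balanced": 0, "quality": -15}
    shift = offsets.get(mode, 0)
    # When the grid is dirty, lean toward smaller models.
    if grid_intensity_g_per_kwh > 450:
        shift += 10
    if complexity < 35 + shift:
        return "small"
    if complexity < 70 + shift:
        return "medium"
    return "large"

print(select_tier(20, 300))          # simple query, clean grid -> "small"
print(select_tier(80, 500, "eco"))   # hard query, dirty grid   -> "medium"
```

The key design point is that the same prompt can legitimately land on different tiers depending on the grid and the user's preference, which is what makes the routing carbon-aware rather than purely accuracy-driven.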

Researcher & Mentor

🧑‍💻
Srinesh Toranala
Student Researcher · ISM Program

Frisco ISD student building GreenInfer as an Independent Study & Mentorship final project. The project synthesizes a year of research in AI efficiency, energy systems, and sustainable computing. Srinesh previously built GreenPromptsOptimizer - a T5-based prompt compression model that now forms the first layer of the GreenInfer pipeline. This project represents a full-stack effort: research, model training, framework design, and deployment - built with the goal of making green AI accessible to every developer.

Python PyTorch Transformers Hugging Face Green AI
🎓
Marta Adamska
PhD Candidate · University of Lancaster · Mentor

Marta Adamska is a PhD candidate at the University of Lancaster whose research sits at the intersection of AI systems, sustainability, and computational efficiency. As the project mentor, she helped refine the scope of GreenInfer - prioritizing the orchestration framework as the core deliverable and the chatbot as its showcase. Her guidance emphasized that the project's feasibility is grounded in the foundational work of the prompt optimization model, and helped shape the research direction toward a rigorous, achievable timeline.

Skills demonstrated

🧠
AI Research
Academic literature review on energy-efficient inference, model quantization, and sustainable AI deployment
💻
ML Engineering
Fine-tuning T5 models, building complexity classifiers, and implementing RL routing policies in PyTorch
⚙️
Systems Design
Multi-layer pipeline architecture balancing accuracy, latency, and real-time carbon intensity constraints
🌐
Full-Stack Dev
End-to-end web application: HTML/CSS/JS frontend, Python backend, Hugging Face API integration
📊
Data Analysis
Empirical benchmarking with CodeCarbon, Pareto frontier analysis, and statistical energy modeling
🌿
Sustainability
Applying carbon-aware computing principles and grid intensity awareness to real-world AI infrastructure
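The Pareto frontier analysis mentioned under Data Analysis can be sketched briefly: given candidate models scored on accuracy (higher is better) and energy per query (lower is better), keep only the options no other model beats on both axes. The model names and numbers below are illustrative, not benchmark results:

```python
# Hypothetical Pareto frontier over (accuracy, energy-per-query) points.
# Model names and numbers are illustrative, not measured benchmarks.

models = {
    "tiny":   (0.72, 0.3),  # (accuracy, Wh per query)
    "small":  (0.81, 1.1),
    "medium": (0.84, 4.0),
    "large":  (0.83, 9.5),  # dominated: less accurate AND costlier than "medium"
}

def pareto_frontier(candidates):
    """Keep models not dominated on both accuracy and energy."""
    frontier = {}
    for name, (acc, wh) in candidates.items():
        dominated = any(
            a >= acc and w <= wh and (a, w) != (acc, wh)
            for a, w in candidates.values()
        )
        if not dominated:
            frontier[name] = (acc, wh)
    return frontier

print(sorted(pareto_frontier(models)))  # ['medium', 'small', 'tiny']
```

Models that fall off the frontier (like the hypothetical "large" above) can be excluded from routing entirely, since some other tier answers at least as well for less energy.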

Key references

Energy and Policy Considerations for Deep Learning in NLP
Strubell et al. (2019) · ACL · Foundation for AI carbon cost quantification
Efficiently Scaling Transformer Inference
Pope et al. (2022) · Google Research · Model efficiency and routing strategies
FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance
Chen et al. (2023) · Stanford · LLM cascade and routing inspiration
CodeCarbon: Estimating the Carbon Footprint of Computation
Lacoste et al. (2019) · NeurIPS Workshop · Energy benchmarking library used in framework
Mixture of Experts: A Survey
Cai et al. (2024) · arXiv · Sparse routing and model specialization
Carbon-Aware Computing for Datacenters
Wiesner et al. (2021) · IEEE · Grid-aware workload scheduling foundation

Open source. Open science.

GreenInfer is built to be shared. The framework code will be published on GitHub for any developer to integrate into their own AI applications.

Try the Chat Demo Read the Docs