GHG Emissions Simulator for Generative AI
This simulator estimates the greenhouse gas emissions (CO₂e) generated by generative AI systems, using the TokenFlop modeling method developed by Digital4Better. It covers training and inference phases, accounts for hardware manufacturing and usage footprints, and supports text, image, audio, and video modalities.
The TokenFlop method comes from Data4Impact, a research program conducted by Digital4Better to develop rigorous tools for evaluating the environmental impact of digital technology. The program won the BPI/ADEME innovation competition.
AI Impact Simulator
Estimate the environmental footprint of your AI usage in real time.
TokenFlop Methodology
TokenFlop is a bottom-up model: it estimates compute load (FLOPs) from model usage, converts that load to GPU time, then to energy consumption and GHG emissions. It also integrates the hardware manufacturing footprint, following life cycle assessment (LCA) logic (ISO 14040 / ITU L.1410).[3][4]
Base unit and input data
The base unit is the token — a discrete unit the model manipulates for input/output. Depending on the modality, a token can be a word fragment, a spatial position, or a coded temporal unit.
| Modality | What is a token | Example |
|---|---|---|
| Text | Word fragment (3-4 characters average) | 1,000 tokens ≈ 750 words in English |
| Image | Spatial patch (e.g. 16×16 px) | |
| Audio | Temporal token (codec, e.g. EnCodec) | |
| Video | Spatial token per frame × frame count | |
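The token definitions above can be turned into rough counting helpers. This is a minimal sketch, not part of the TokenFlop specification: the function names are hypothetical, the text ratio uses the "1,000 tokens ≈ 750 words" rule from the table, and the image and video counts assume non-overlapping square patches with no temporal compression.

```python
import math


def text_tokens_from_words(words: int, words_per_token: float = 0.75) -> int:
    """Approximate English token count (1,000 tokens ~ 750 words)."""
    return math.ceil(words / words_per_token)


def image_tokens(width_px: int, height_px: int, patch_px: int = 16) -> int:
    """Number of square patches needed to tile the image (assumed tokenization)."""
    return math.ceil(width_px / patch_px) * math.ceil(height_px / patch_px)


def video_tokens(width_px: int, height_px: int, frames: int, patch_px: int = 16) -> int:
    """Spatial tokens per frame multiplied by the frame count."""
    return image_tokens(width_px, height_px, patch_px) * frames


print(text_tokens_from_words(750))
print(image_tokens(224, 224))
print(video_tokens(224, 224, frames=24))
```

Real tokenizers and codecs vary by model, so these counts are only order-of-magnitude inputs for the steps that follow.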
Compute load estimation (FLOPs)
Computational load is estimated by usage phase:[1]
| Phase | Formula |
|---|---|
| Training | |
| Fine-tuning | |
| Inference — prompt | |
| Inference — text generation | |
| Image generation | |
| Video generation | |
Inference assumption: a KV cache is always present, which reduces the prompt cost to about 1 FLOP per parameter per token.
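As an illustration of how a per-phase FLOP estimate can be computed, here is a minimal sketch. Only the prompt coefficient (about 1 FLOP per parameter per token) comes from the text above. The training coefficient (6) and the generation coefficient (2) are assumptions taken from the commonly used dense-transformer rule of thumb, and they may differ from the coefficients TokenFlop actually uses. All names are hypothetical.

```python
# FLOPs per parameter per token, by phase.
PROMPT_COEFF = 1.0      # stated in the text (KV cache assumed present)
TRAINING_COEFF = 6.0    # ASSUMPTION: common forward + backward rule of thumb
GENERATION_COEFF = 2.0  # ASSUMPTION: common forward-pass rule of thumb


def training_flops(params: float, training_tokens: float) -> float:
    """Estimated FLOPs to train a dense model on a given number of tokens."""
    return TRAINING_COEFF * params * training_tokens


def inference_flops(params: float, prompt_tokens: int, output_tokens: int) -> float:
    """Estimated FLOPs for one text request (prompt processing + generation)."""
    prompt = PROMPT_COEFF * params * prompt_tokens
    generation = GENERATION_COEFF * params * output_tokens
    return prompt + generation


# Example: a 405B-parameter model trained on ~15 trillion tokens.
print(f"{training_flops(405e9, 15e12):.3e} FLOPs")
# Example: a 400-token prompt with no generated output.
print(f"{inference_flops(405e9, 400, 0):.3e} FLOPs")
```

Keeping the coefficients as named constants makes it easy to swap in the method's own values once they are known.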
Conversion to GPU time (GPUh)
FLOPs are converted to effective processing time:
$$\text{GPUh} = \frac{\text{FLOPs}}{C_{\text{GPU}} \times \text{MFU}}$$
- $C_{\text{GPU}}$: theoretical GPU capacity in FLOP/h (e.g. 989 TFLOPS dense BF16 for an H100, i.e. about 3.56 × 10¹⁸ FLOP/h)
- MFU: Model FLOP Utilization, the share of theoretical capacity actually usable, estimated between 25% and 50%. Default value: 40%.[8]
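The conversion from FLOPs to GPU hours can be sketched as follows. The function name is hypothetical; the defaults reuse the H100 throughput figure and the 40% MFU quoted above, and the peak value is given per second and converted to FLOP/h inside the function.

```python
def flops_to_gpu_hours(
    flops: float,
    peak_flops_per_second: float = 989e12,  # H100 figure quoted in the text
    mfu: float = 0.40,                      # default MFU from the text
) -> float:
    """Convert a FLOP count to GPU hours at a given utilization."""
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in (0, 1]")
    effective_flops_per_hour = peak_flops_per_second * 3600.0 * mfu
    return flops / effective_flops_per_hour


# Example: 1e21 FLOPs on a single H100 at the default MFU.
print(f"{flops_to_gpu_hours(1e21):.1f} GPUh")
```

Because MFU sits in the denominator, moving from 50% to 25% utilization doubles the estimated GPU time, which is why it is one of the most sensitive parameters of the model.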
Energy consumption conversion
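GPU time is translated into energy consumed, in kWh:
$$E = \text{GPUh} \times \frac{P_{\text{GPU}}}{1000} \times \text{PUE}$$
- $P_{\text{GPU}}$: GPU power in watts (e.g. 700 W for an H100)
- PUE: Power Usage Effectiveness, the ratio of total data center energy to IT equipment energy. Default value: 1.2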
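A minimal sketch of the energy step, assuming the GPU draws its rated power for the whole duration (a simplification; the function name is hypothetical):

```python
def gpu_hours_to_kwh(
    gpu_hours: float,
    gpu_power_watts: float = 700.0,  # H100 example from the text
    pue: float = 1.2,                # default PUE from the text
) -> float:
    """Energy drawn at the data center level, in kWh."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return gpu_hours * (gpu_power_watts / 1000.0) * pue


# Example: 1,000 GPU hours on H100-class hardware.
print(f"{gpu_hours_to_kwh(1000):.0f} kWh")
```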
Operational environmental impact
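Energy is converted to GHG emissions via the regional electricity emission factor:
$$\text{GHG}_{\text{op}} = E \times EF_{\text{region}}$$
- $E$: energy consumed, in kWh
- $EF_{\text{region}}$: emission factor by region, in kgCO₂e/kWh, from the Digital4Better open data repository (e.g. 0.420 kgCO₂e/kWh for the US, 0.040 kgCO₂e/kWh for France).[6]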
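The operational step is a single multiplication, sketched below. The two emission factors are the examples quoted above; the dictionary keys and function name are hypothetical, and a real implementation would load the full table from the open data repository.

```python
# Emission factors in kgCO2e/kWh (the two examples quoted in the text).
EMISSION_FACTORS = {"US": 0.420, "FR": 0.040}


def operational_emissions_kg(energy_kwh: float, region: str) -> float:
    """Operational GHG emissions for a given energy use and region."""
    try:
        factor = EMISSION_FACTORS[region]
    except KeyError:
        raise ValueError(f"no emission factor for region {region!r}") from None
    return energy_kwh * factor


# Same workload, two grids.
for region in ("US", "FR"):
    print(region, f"{operational_emissions_kg(840.0, region):.1f} kgCO2e")
```

With these two factors, the same workload emits roughly ten times more in the US than in France, so the region choice dominates the operational result.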
Manufacturing impact (embodied footprint)
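Hardware manufacturing footprint is allocated proportionally to usage time:
$$\text{GHG}_{\text{emb}} = \text{GHG}_{\text{mfg}} \times \frac{\text{GPUh}}{T_{\text{life}}}$$
- $\text{GHG}_{\text{mfg}}$: manufacturing footprint per GPU, including its share of the server
- $T_{\text{life}}$: hardware lifespan in hours. Default lifespan: 5 years.
The footprint of non-GPU server components (CPU, RAM, storage, chassis) is distributed across the server's GPUs in proportion to GPU count, following LCA logic (ISO 14040 / ITU L.1410).[3][4]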
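A time-proportional allocation of the manufacturing footprint can be sketched as below. This assumes the hardware is in use for every hour of its lifespan and that a year has 8,760 hours; both are simplifications. The function names are hypothetical, and the footprint values in the example are placeholders, not published data.

```python
HOURS_PER_YEAR = 8760.0  # ASSUMPTION: 365 days x 24 h, continuous use


def per_gpu_manufacturing_kg(gpu_kg: float, server_other_kg: float, gpus_per_server: int) -> float:
    """GPU footprint plus an equal share of the non-GPU server components."""
    return gpu_kg + server_other_kg / gpus_per_server


def embodied_emissions_kg(gpu_hours: float, manufacturing_kg_per_gpu: float, lifespan_years: float = 5.0) -> float:
    """Share of the manufacturing footprint attributed to the GPU hours used."""
    lifetime_hours = lifespan_years * HOURS_PER_YEAR
    return manufacturing_kg_per_gpu * gpu_hours / lifetime_hours


# Placeholder footprints (hypothetical, for illustration only).
footprint = per_gpu_manufacturing_kg(gpu_kg=150.0, server_other_kg=800.0, gpus_per_server=8)
print(f"{embodied_emissions_kg(1000.0, footprint):.2f} kgCO2e")
```

If real utilization is lower than continuous use, the same footprint is spread over fewer productive hours and the embodied share per GPU hour rises.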
Validation — Llama 3.1 405B
For consistency verification, TokenFlop was applied to the open-source Llama 3.1 models. The largest (405B parameters) was trained on ~15 trillion tokens with 24,576 H100 GPUs:
| Model | Estimated GPU time | Estimated emissions |
|---|---|---|
| Llama 3.1 8B | 1.46 M GPUh | ~420 tCO₂e |
| Llama 3.1 70B | 7.0 M GPUh | ~2,040 tCO₂e |
| Llama 3.1 405B | 30.84 M GPUh | ~8,930 tCO₂e |
Deviation from the data published on Hugging Face is under 2%, which supports the coherence of the modeling. For inference, a 400-token average prompt on Llama 3.1 405B yields ~0.1 gCO₂e per request.[5]
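One quick sanity check on the table is to compute the emissions per GPU hour implied by each row: if the same hardware and grid assumptions apply to all three models, the ratio should be nearly constant. The sketch below uses only the table values; the variable and function names are hypothetical.

```python
# (GPU hours, tCO2e) as listed in the validation table.
LLAMA_31_RESULTS = {
    "Llama 3.1 8B": (1.46e6, 420.0),
    "Llama 3.1 70B": (7.0e6, 2040.0),
    "Llama 3.1 405B": (30.84e6, 8930.0),
}


def intensity_kg_per_gpu_hour(gpu_hours: float, tonnes_co2e: float) -> float:
    """Implied emissions per GPU hour, in kgCO2e."""
    return tonnes_co2e * 1000.0 / gpu_hours


intensities = {
    name: intensity_kg_per_gpu_hour(hours, tonnes)
    for name, (hours, tonnes) in LLAMA_31_RESULTS.items()
}
# Relative gap between the highest and lowest implied intensity.
spread = max(intensities.values()) / min(intensities.values()) - 1.0

for name, value in intensities.items():
    print(f"{name}: {value:.3f} kgCO2e/GPUh")
print(f"spread: {spread:.1%}")
```

The three rows land within about 1.5% of each other, which is consistent with a single set of per-GPU-hour assumptions being applied across model sizes.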
Assumptions and limitations
Results are estimates from theoretical modeling and do not constitute direct measurement of actual emissions. Main sources of uncertainty:
- Actual model characteristics are often confidential (training data, effective MFU, number of hidden dimensions).
- Lack of reliable LCA data on certain AI-specific equipment.
- TPU, FPGA, and ASIC specificities are not accounted for.
- Model-to-hardware memory adequacy is not verified.
The method is suited for relative scenario comparison, project framing, and prospective evaluation — not for certified emissions reporting.
Bibliography
- [1] Schwartz, R., et al. (2020). Green AI. Communications of the ACM. arXiv: 1907.10597
- [2] IEA (2024). Energy and AI.
- [3] ISO 14040/14044. Environmental management: Life Cycle Assessment.
- [4] ITU L.1410. Methodology for the assessment of the environmental life cycle impact of ICT goods, networks and services.
- [5] Meta (2024). The Llama 3 Herd of Models. arXiv: 2407.21783
- [6] Digital4Better. Open Data Repository. digital4better.github.io/data
- [7] Digital4Better. Open Methodology for Generative AI. digital4better.github.io/methodology/ai
- [8] NVIDIA (2025). Llama 3.1 70B DGXC Benchmarking.


