
GHG Emissions Simulator for Generative AI

This simulator estimates the greenhouse gas emissions (CO₂e) generated by generative AI systems, using the TokenFlop modeling method developed by Digital4Better. It covers training and inference phases, accounts for hardware manufacturing and usage footprints, and supports text, image, audio, and video modalities.

The TokenFlop method comes from Data4Impact, a research program conducted by Digital4Better to develop rigorous tools for evaluating the environmental impact of digital technology. The program won the BPI/ADEME innovation competition.

  1. Token: universal unit. ~4 characters for text, spatial patches for images, temporal frames for video/audio.
  2. FLOPs: compute load estimated by use case (6×P×tokens for training, 2×P×tokens for inference).
  3. GPU time: FLOPs ÷ (GPU capacity × MFU). MFU between 25% and 50%, default 40%.
  4. Energy: GPU power × time × PUE (data center efficiency, default 1.2).
  5. Carbon: energy × regional emission factor + hardware manufacturing amortized over 5 years.

AI Impact Simulator

Estimate the environmental footprint of your AI usage in real time.



Results are order-of-magnitude estimates derived from theoretical modeling based on publicly available data. They do not constitute a direct measurement of actual emissions, and they depend on the input parameters and assumptions; consult the methodology for scope and limitations.

TokenFlop Methodology

TokenFlop is a bottom-up model: it estimates compute load (FLOPs) from model usage, converts that load to GPU time, then to energy consumption and GHG emissions. It also integrates the hardware manufacturing footprint following LCA logic (ISO 14040 / ITU L.1410).[3][4]

1. Base unit and input data

The base unit is the token — a discrete unit the model manipulates for input/output. Depending on the modality, a token can be a word fragment, a spatial position, or a coded temporal unit.

Modality | What is a token | Example
Text | Word fragment (3-4 characters on average) | 1,000 tokens ≈ 750 words in English
Image | Spatial patch (e.g. 16×16 px) |
Audio | Temporal token (codec, e.g. EnCodec) |
Video | Spatial token per frame × frame count |
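
As a rough illustration of the table above, token counts can be approximated per modality. This is a minimal sketch and not part of the TokenFlop tooling: the function names are hypothetical, and the 4-characters-per-token ratio and 16×16 px patch size are simply the example values quoted in this section. Real tokenizers and vision encoders will give different counts.

```python
import math


def estimate_text_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Approximate a text token count from its character count.

    Assumes ~4 characters per token (the rule of thumb quoted above).
    """
    return math.ceil(len(text) / chars_per_token)


def estimate_image_tokens(width_px: int, height_px: int, patch_px: int = 16) -> int:
    """Count spatial patches, rounding partial patches up on each axis."""
    return math.ceil(width_px / patch_px) * math.ceil(height_px / patch_px)


def estimate_video_tokens(width_px: int, height_px: int,
                          frame_count: int, patch_px: int = 16) -> int:
    """Spatial tokens per frame multiplied by the number of frames."""
    return estimate_image_tokens(width_px, height_px, patch_px) * frame_count


if __name__ == "__main__":
    print(estimate_text_tokens("x" * 4000))      # 4,000 characters of text
    print(estimate_image_tokens(224, 224))       # one 224×224 image
    print(estimate_video_tokens(224, 224, 10))   # ten 224×224 frames
```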

2. Compute load estimation (FLOPs)

Computational load is estimated by usage phase:[1]

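Phase | Formula (FLOPs)
Training | 6 × P × tokens
Fine-tuning |
Inference (prompt) | 1 × P × tokens
Inference (text generation) | 2 × P × tokens
Image generation |
Video generation |

Here P is the number of model parameters and tokens is the number of tokens processed in that phase.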

Inference assumption: a KV cache is always assumed to be present, which reduces prompt processing cost to ~1 FLOP per parameter per token.
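
The compute-load step can be sketched in a few lines. This is an illustrative reading of the coefficients quoted on this page (6 for training, 2 for generation, ~1 for prompt processing with a KV cache), each taken as FLOPs per parameter per token. The function names are hypothetical, and the fine-tuning, image, and video formulas are not covered because they are not stated here.

```python
def training_flops(params: float, tokens: float) -> float:
    """Training cost: assumed 6 FLOPs per parameter per training token."""
    return 6.0 * params * tokens


def prompt_flops(params: float, prompt_tokens: float) -> float:
    """Prompt processing with a KV cache: ~1 FLOP per parameter per token."""
    return 1.0 * params * prompt_tokens


def generation_flops(params: float, output_tokens: float) -> float:
    """Text generation: 2 FLOPs per parameter per generated token."""
    return 2.0 * params * output_tokens


if __name__ == "__main__":
    # Example inputs: a 405B-parameter model trained on ~15 trillion tokens.
    print(f"training: {training_flops(405e9, 15e12):.3e} FLOPs")
    # A 400-token prompt on the same model.
    print(f"prompt:   {prompt_flops(405e9, 400):.3e} FLOPs")
```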

3. Conversion to GPU time (GPUh)

FLOPs are converted to effective processing time:

GPU time (GPUh) = FLOPs ÷ (GPU capacity × MFU)

  • GPU capacity : Theoretical GPU capacity in FLOP/h (e.g. 989 TFLOPS dense BF16 for an H100, i.e. 989 × 10¹² FLOP/s × 3,600 s/h)
  • MFU : Model FLOP Utilization, the percentage of theoretical capacity actually usable, estimated between 25% and 50%. Default value: 40%.[8]
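
The conversion can be illustrated as follows. This is a minimal sketch under the defaults quoted above (989 TFLOPS peak, MFU of 40%). The function name and the explicit FLOP/s to FLOP/h conversion are assumptions made for this example.

```python
SECONDS_PER_HOUR = 3600


def gpu_hours(flops: float,
              peak_flops_per_s: float = 989e12,
              mfu: float = 0.40) -> float:
    """Convert a FLOP count into GPU-hours.

    peak_flops_per_s: theoretical GPU throughput in FLOP/s (989e12 is the
        H100 example value from the text).
    mfu: fraction of the theoretical capacity actually used (0 < mfu <= 1).
    """
    if not 0.0 < mfu <= 1.0:
        raise ValueError("mfu must be in the interval (0, 1]")
    effective_flops_per_hour = peak_flops_per_s * SECONDS_PER_HOUR * mfu
    return flops / effective_flops_per_hour


if __name__ == "__main__":
    # A workload of 1e21 FLOPs at the default capacity and MFU.
    print(f"{gpu_hours(1e21):.1f} GPUh")
```

Lowering the MFU from 40% to 25% increases the estimated GPU time by a factor of 1.6, which is why this parameter dominates the uncertainty of the step.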

4. Energy consumption conversion

GPU time is translated into energy consumed:

Energy (kWh) = GPU power (W) × GPU time (h) × PUE ÷ 1,000

  • GPU power : GPU power in watts (e.g. 700 W for an H100)
  • PUE : Power Usage Effectiveness, the data center energy efficiency ratio. Default value: 1.2
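
A short sketch of this step, assuming the GPU draws its rated power for the whole duration (a simplification) and using the example values from the text. The function name is hypothetical.

```python
def energy_kwh(gpu_hours: float,
               gpu_power_w: float = 700.0,
               pue: float = 1.2) -> float:
    """Energy drawn at the data center level, in kWh.

    gpu_power_w: GPU power in watts (700 W is the H100 example value).
    pue: Power Usage Effectiveness; values below 1 are not physical.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be lower than 1")
    gpu_energy_kwh = gpu_hours * gpu_power_w / 1000.0  # W·h -> kWh
    return gpu_energy_kwh * pue


if __name__ == "__main__":
    # 1,000 GPU-hours at 700 W with a PUE of 1.2 gives about 840 kWh.
    print(f"{energy_kwh(1000):.1f} kWh")
```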

5. Operational environmental impact

Energy is converted to GHG emissions via the regional electricity emission factor:

Operational emissions (kgCO₂e) = Energy (kWh) × Emission factor (kgCO₂e/kWh)

  • Emission factor : emission factor by region, from the Digital4Better open data repository (e.g. 0.420 kgCO₂e/kWh for the US, 0.040 kgCO₂e/kWh for France).[6]
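
The effect of the regional factor is easy to see with the two example values quoted above. In this sketch the lookup table and function name are hypothetical, and only the US and France factors come from the text. Other regions would have to be taken from the open data repository.

```python
# Example emission factors in kgCO2e per kWh, as quoted in the text.
EMISSION_FACTORS_KG_PER_KWH = {
    "US": 0.420,
    "FR": 0.040,
}


def operational_emissions_kg(energy_kwh: float, region: str) -> float:
    """Operational emissions in kgCO2e for a given energy use and region."""
    try:
        factor = EMISSION_FACTORS_KG_PER_KWH[region]
    except KeyError:
        raise ValueError(f"no emission factor available for region {region!r}")
    return energy_kwh * factor


if __name__ == "__main__":
    for region in ("US", "FR"):
        kg = operational_emissions_kg(840.0, region)
        print(f"{region}: {kg:.1f} kgCO2e for 840 kWh")
```

With these two factors, the same workload emits roughly ten times more in the US than in France, so the hosting region can matter more than the hardware parameters.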

6. Manufacturing impact (embodied footprint)

Hardware manufacturing footprint is allocated proportionally to usage time:

Embodied emissions = Manufacturing footprint × (Usage time ÷ Lifespan)

Default lifespan: 5 years. The footprint of non-GPU server components (CPU, RAM, storage, chassis) is split across the GPUs in a server according to the GPU count per server, following LCA logic (ISO 14040 / ITU L.1410).[3][4]
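
The allocation can be sketched as below. This is an illustration only: the function name is hypothetical, the footprint figures in the usage example are placeholders rather than measured LCA data, and the sketch assumes the lifespan is counted in wall-clock hours (24 × 365 per year), which the text does not specify.

```python
HOURS_PER_YEAR = 24 * 365


def embodied_emissions_kg(gpu_hours: float,
                          gpu_footprint_kg: float,
                          server_footprint_kg: float,
                          gpus_per_server: int = 8,
                          lifespan_years: float = 5.0) -> float:
    """Share of the manufacturing footprint attributed to a workload.

    gpu_footprint_kg: manufacturing footprint of one GPU (kgCO2e).
    server_footprint_kg: footprint of the non-GPU server components
        (CPU, RAM, storage, chassis), split evenly across the GPUs.
    """
    if gpus_per_server <= 0 or lifespan_years <= 0:
        raise ValueError("gpus_per_server and lifespan_years must be positive")
    per_gpu_footprint = gpu_footprint_kg + server_footprint_kg / gpus_per_server
    lifetime_hours = lifespan_years * HOURS_PER_YEAR
    return per_gpu_footprint * gpu_hours / lifetime_hours


if __name__ == "__main__":
    # Placeholder footprints (NOT real LCA values): 100 kg per GPU and
    # 800 kg for the rest of an 8-GPU server, i.e. 200 kg per GPU in total.
    print(f"{embodied_emissions_kg(438, 100.0, 800.0):.2f} kgCO2e")
```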

7. Validation: Llama 3.1 405B

As a consistency check, TokenFlop was applied to the open-source Llama 3.1 model family (8B, 70B, and 405B parameters). The 405B model was trained on ~15 trillion tokens with 24,576 H100 GPUs:

Model | Estimated GPU time | Estimated emissions
Llama 3.1 8B | 1.46 M GPUh | ~420 tCO₂e
Llama 3.1 70B | 7.0 M GPUh | ~2,040 tCO₂e
Llama 3.1 405B | 30.84 M GPUh | ~8,930 tCO₂e

The deviation from the figures published on Hugging Face is below 2%, which supports the coherence of the modeling. For inference, with a 400-token average prompt on Llama 3.1 405B, the estimate is ~0.1 gCO₂e per request.[5]
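
One quick way to sanity-check the table is to derive the emission intensity per GPU-hour that each row implies. The sketch below does only that arithmetic on the three rows. It is an internal consistency check written for this page, not the comparison with Hugging Face data described above, and the variable names are hypothetical.

```python
# (GPU-hours, emissions in tCO2e) as listed in the validation table.
VALIDATION_ROWS = {
    "Llama 3.1 8B": (1.46e6, 420.0),
    "Llama 3.1 70B": (7.0e6, 2040.0),
    "Llama 3.1 405B": (30.84e6, 8930.0),
}


def kg_per_gpu_hour(gpu_hours: float, tonnes_co2e: float) -> float:
    """Emission intensity implied by one table row, in kgCO2e per GPU-hour."""
    return tonnes_co2e * 1000.0 / gpu_hours


intensities = {
    model: kg_per_gpu_hour(hours, tonnes)
    for model, (hours, tonnes) in VALIDATION_ROWS.items()
}

# Relative spread between the highest and lowest implied intensity.
spread = (max(intensities.values()) - min(intensities.values())) / min(intensities.values())

if __name__ == "__main__":
    for model, value in intensities.items():
        print(f"{model}: {value:.3f} kgCO2e/GPUh")
    print(f"relative spread: {spread:.1%}")
```

All three rows imply roughly 0.29 kgCO₂e per GPU-hour, which is what one would expect if the same power, PUE, and emission factor assumptions were applied to every model size.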

Assumptions and limitations

Results are estimates from theoretical modeling and do not constitute direct measurement of actual emissions. Main sources of uncertainty:

  • Actual model characteristics are often confidential (training data, effective MFU, number of hidden dimensions).
  • Lack of reliable LCA data on certain AI-specific equipment.
  • TPU, FPGA, and ASIC specificities are not accounted for.
  • Model-to-hardware memory adequacy is not verified.

The method is suited for relative scenario comparison, project framing, and prospective evaluation — not for certified emissions reporting.

Bibliography

  1. Schwartz, R., et al. (2020). Green AI. Communications of the ACM. arXiv: 1907.10597
  2. IEA (2024). Energy and AI.
  3. ISO 14040/14044. Environmental management — Life Cycle Assessment.
  4. ITU L.1410. Methodology for the assessment of the environmental life cycle impact of ICT goods, networks and services.
  5. Meta (2024). The Llama 3 Herd of Models. arXiv: 2407.21783
  6. Digital4Better. Open Data Repository. digital4better.github.io/data
  7. Digital4Better. Open Methodology for Generative AI. digital4better.github.io/methodology/ai
  8. NVIDIA (2025). Llama 3.1 70B DGXC Benchmarking.
