NVIDIA GPU Energy Efficiency for AI
- NVIDIA HGX B200: up to 15× better energy efficiency than HGX H100, a 93% reduction in energy for the same inference workload, and a 24% reduction in embodied carbon emissions across AI training and inference.
- NVIDIA H200: improved efficiency via HBM3E memory (141 GB, 4.8 TB/s).
- NVIDIA A2 Tensor Core GPU: 40–60 W configurable TDP, 20× more inference performance than CPU-only servers, and 10% better energy efficiency than the prior GPU generation.
- Research on H100 shows that batching and request shaping can reduce per-request energy by up to 100×.
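To give intuition for the batching claim, here is a minimal toy model (not measured data) of why per-request energy falls as batch size grows: the GPU's baseline power draw is amortized across all requests in a batch, while only the marginal compute energy scales per request. All constants below are hypothetical assumptions, not figures from the sources.

```python
# Toy amortization model: per-request energy vs. batch size.
# The constants are illustrative assumptions, not measurements.

STATIC_POWER_W = 400.0    # assumed baseline GPU power draw while serving
BATCH_LATENCY_S = 0.05    # assumed latency of one batch step (held flat here)
DYNAMIC_PER_REQ_J = 5.0   # assumed marginal compute energy per request


def energy_per_request(batch_size: int) -> float:
    """Static energy for the batch step is shared; dynamic cost is per request."""
    static_j = STATIC_POWER_W * BATCH_LATENCY_S  # energy burned per batch step
    return static_j / batch_size + DYNAMIC_PER_REQ_J


if __name__ == "__main__":
    for b in (1, 8, 64):
        print(f"batch={b:3d}  energy/request = {energy_per_request(b):6.2f} J")
```

Under these assumptions, per-request energy drops from 25 J at batch size 1 to about 5.3 J at batch size 64; real savings depend on how batch latency and power draw actually scale on the hardware.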
All claims above are supported by the sources listed below.