NVIDIA H200

Hopper architecture · 141GB memory · 990 FP16 TFLOPS · 700W TDP

Cloud Pricing Today

GPU         Price       7d change
H100        $6.39/hr    1.2%
A100 80GB   $2.45/hr    0.5%
H200        $10.29/hr   0.8%
L40S        $1.28/hr    0.3%
T4          $0.24/hr    0.6%
L4          $0.45/hr    1.1%

About the NVIDIA H200

The NVIDIA H200 is an evolution of the H100 that addresses one of its key limitations: memory capacity. By upgrading from 80GB HBM3 to 141GB HBM3e, the H200 provides 76% more memory while maintaining the same 990 FP16 TFLOPS of compute performance.

This memory upgrade is particularly significant for large language model inference, where the entire model must fit in GPU memory for efficient serving. Models that required multi-GPU setups on the H100 can run on fewer H200 GPUs, reducing both cost and latency.
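The memory arithmetic above can be sketched quickly. This is a minimal sizing helper (hypothetical, not from the page) that counts weights only, ignoring KV cache, activations, and framework overhead, so real deployments need headroom beyond it:

```python
import math

def gpus_needed(params_billions: float, bytes_per_param: int, vram_gb: float) -> int:
    """Minimum GPUs whose combined VRAM holds the model weights alone."""
    weights_gb = params_billions * bytes_per_param  # 1B params at N bytes/param ≈ N GB
    return math.ceil(weights_gb / vram_gb)

# A 175B-parameter model in FP16 (2 bytes/param) is ~350 GB of weights:
print(gpus_needed(175, 2, 80))   # H100 80GB  → 5
print(gpus_needed(175, 2, 141))  # H200 141GB → 3
```

The jump from 80GB to 141GB shows up directly in the GPU count, which is where the cost and latency savings come from.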

The H200 also benefits from higher memory bandwidth — 4.8 TB/s compared to the H100's 3.35 TB/s — which improves performance on memory-bandwidth-bound operations common in transformer inference. For training workloads, the H200 offers modest improvements through better memory utilisation, though the compute performance remains identical to the H100.
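A simple roofline calculation, using the peak figures quoted above, illustrates why the bandwidth bump matters. The ridge point is the arithmetic intensity (FLOPs per byte) below which a kernel is memory-bound; small-batch transformer decode sits at roughly 1 FLOP per byte, far below either GPU's ridge, so bandwidth is the limiter:

```python
def ridge_point(peak_tflops: float, bandwidth_tb_s: float) -> float:
    """FLOPs per byte above which a kernel becomes compute-bound."""
    return (peak_tflops * 1e12) / (bandwidth_tb_s * 1e12)

print(ridge_point(990, 3.35))  # H100: ~295 FLOPs/byte
print(ridge_point(990, 4.8))   # H200: ~206 FLOPs/byte
```

For a kernel stuck in the memory-bound region, throughput scales with bandwidth, so the H200's 4.8 TB/s gives roughly a 43% uplift over the H100 on those operations.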

In the cloud market, the H200 commands a premium over the H100 but can deliver better total cost of ownership for memory-intensive workloads by reducing the number of GPUs required.
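The total-cost trade-off can be sketched with the blended prices from the table above (prices vary by provider and region, and this ignores KV cache and overhead, so treat it as illustrative):

```python
import math

# Blended hourly prices quoted on this page
H100 = {"vram_gb": 80, "price_hr": 6.39}
H200 = {"vram_gb": 141, "price_hr": 10.29}

def fleet_cost(model_gb: float, gpu: dict) -> tuple:
    """(GPU count, hourly cost) to hold model_gb of weights on one GPU type."""
    n = math.ceil(model_gb / gpu["vram_gb"])
    return n, round(n * gpu["price_hr"], 2)

# ~350 GB of FP16 weights (a 175B-parameter model):
print(fleet_cost(350, H100))  # (5, 31.95)
print(fleet_cost(350, H200))  # (3, 30.87)
```

Despite the ~60% per-GPU premium, needing three H200s instead of five H100s makes the H200 fleet slightly cheaper per hour for this workload, which is the TCO argument in miniature.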

Memory (VRAM)
141 GB
FP16 Performance
990 TFLOPS
Power (TDP)
700W
Architecture
Hopper

Common Use Cases

LLM inference at scale
Large model training
Memory-intensive AI workloads
Generative AI serving

Key Facts

Manufacturer
NVIDIA
Architecture
Hopper
Accelerator Type
GPU
Primary Use
Training
Memory (VRAM)
141 GB
FP16 Performance
990 TFLOPS
Thermal Design Power
700W

Frequently Asked Questions

How much does an H200 cost per hour?

The NVIDIA H200 blended cloud pricing typically ranges from $8–$14 per hour depending on region and pricing model. It commands a significant premium over the H100 due to its higher memory capacity and bandwidth.

What is the difference between H100 and H200?

The H200 has 76% more memory (141GB vs 80GB) and 43% higher memory bandwidth (4.8 vs 3.35 TB/s) compared to the H100. Compute performance (990 FP16 TFLOPS) is identical. The H200 excels at memory-intensive workloads like large model inference.

Is the H200 better than the H100 for inference?

Yes, for large model inference the H200 is generally better due to its larger memory (141GB vs 80GB) and higher memory bandwidth. This allows it to serve larger models on fewer GPUs, reducing overall inference cost and latency.
