Live blended cloud pricing (7-day change):

GPU         Price       7d change
H100        $6.39/hr    1.2%
A100 80GB   $2.45/hr    0.5%
H200        $10.29/hr   0.8%
L40S        $1.28/hr    0.3%
T4          $0.24/hr    0.6%
L4          $0.45/hr    1.1%
Architecture

NVIDIA GPU Generations: Volta to Blackwell

NVIDIA has released five major data centre GPU generations: Volta (2017), Turing (2018), Ampere (2020), Hopper (2022), and Blackwell (2024). Each generation has delivered roughly a 2-3x improvement in peak throughput over its predecessor. According to Signwl data, all five generations are currently available in the cloud, from the V100 at ~$1.64/hr to the GB300 at ~$12.32/hr.
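One way to read those prices is throughput per dollar. The sketch below is purely illustrative, using the blended hourly prices and peak dense FP16 TFLOPS figures quoted in this article; it ignores memory capacity, interconnect, and utilisation, which dominate real-world value.

```python
# Throughput-per-dollar across the five generations, using the
# blended prices and peak FP16 TFLOPS cited in this article.
gpus = {
    # name: (blended $/hr, peak dense FP16 TFLOPS)
    "V100":  (1.64, 125),
    "T4":    (0.24, 65),
    "A100":  (2.45, 312),
    "H100":  (6.39, 990),
    "GB300": (12.32, 2250),
}

# Sort best-to-worst by TFLOPS per dollar-hour
ranked = sorted(gpus.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True)
for name, (price, tflops) in ranked:
    print(f"{name:>5}: {tflops / price:6.1f} TFLOPS per $/hr")
```

By this crude metric the T4 still leads on raw FP16 TFLOPS per dollar, which is part of why small inference workloads often stay on older generations; for large training jobs, per-GPU memory and interconnect matter far more than this ratio captures.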

Volta (V100) — 2017

The V100 was the first GPU with Tensor Cores, introducing hardware-accelerated mixed-precision training. With 125 FP16 TFLOPS and 16-32GB of HBM2, it pioneered the use of GPUs for deep learning at scale. The V100 remains available at budget price points in the cloud.

Turing (T4) — 2018

Turing brought INT8 inference acceleration and the T4 — a compact, power-efficient GPU that became the default for inference workloads. At 65 FP16 TFLOPS and only 70W TDP, the T4 remains one of the most widely deployed inference GPUs globally.
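To make "INT8 inference acceleration" concrete, here is a toy sketch of symmetric int8 quantization, the integer representation Turing-class Tensor Cores operate on at high throughput. The scale and weight values are purely illustrative, not taken from any real model.

```python
# Toy symmetric INT8 quantization: map floats to signed 8-bit
# integers via a shared scale factor, as used in INT8 inference.

def quantize(x: float, scale: float) -> int:
    """Map a float to int8, clamped to the representable range [-128, 127]."""
    q = round(x / scale)
    return max(-128, min(127, q))

def dequantize(q: int, scale: float) -> float:
    """Recover an approximate float from its int8 representation."""
    return q * scale

scale = 0.25  # illustrative: chosen so typical weights fit in int8
weights = [0.5, -1.25, 2.0, 40.0]  # 40.0 is out of range and gets clamped
quantized = [quantize(w, scale) for w in weights]
recovered = [dequantize(q, scale) for q in quantized]
print(quantized)   # [2, -5, 8, 127]
print(recovered)   # [0.5, -1.25, 2.0, 31.75]
```

The hardware win comes from doing the bulk matrix multiplies in int8 (4x smaller than FP32, with much higher Tensor Core throughput) and only converting back to float at the edges, at the cost of the clamping and rounding error visible in the last element.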

Ampere (A100) — 2020

Ampere delivered a major performance leap with the A100 — 312 FP16 TFLOPS, up to 80GB of HBM2e, and Multi-Instance GPU (MIG) technology. The A100 was the dominant training GPU for three years and remains widely used for cost-effective training and inference.
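MIG partitioning is configured through nvidia-smi. A minimal sketch for splitting an A100 80GB into seven 1g.10gb slices follows; profile IDs vary by GPU model and driver version (ID 19 is assumed here), so always list the supported profiles on your own system first.

```shell
# Enable MIG mode on GPU 0 (requires root; the GPU must be idle,
# and a reset or reboot may be needed before the change takes effect)
sudo nvidia-smi -i 0 -mig 1

# List the MIG profiles this GPU/driver supports, with their IDs
sudo nvidia-smi mig -lgip

# Create seven 1g.10gb GPU instances (profile ID 19 assumed) and
# their matching compute instances (-C) in one step
sudo nvidia-smi mig -cgi 19,19,19,19,19,19,19 -C

# Verify: each MIG slice now appears as a separately schedulable device
nvidia-smi -L
```

Each slice gets its own memory, cache, and compute partition, so seven independent inference jobs can share one A100 without interfering with each other.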

Hopper (H100/H200) — 2022

Hopper introduced the Transformer Engine with FP8 precision support, delivering 990 FP16 TFLOPS — a 3.2x leap over the A100. The H100 became the de facto standard for AI training. The H200 variant added 141GB of HBM3e for memory-intensive inference workloads.

Blackwell (B200/GB200/GB300) — 2024

Blackwell is NVIDIA's current flagship architecture, delivering up to 2,250 FP16 TFLOPS with the GB300 and 288GB of HBM3e. The second-generation Transformer Engine supports FP4 precision, further accelerating transformer workloads. Blackwell GPUs are in early cloud deployment with availability expanding.

Frequently Asked Questions

Which NVIDIA GPU generation is the latest?

Blackwell (2024) is NVIDIA's latest data centre GPU generation, including the B200, GB200, and GB300. It delivers up to 2,250 FP16 TFLOPS and 288GB of HBM3e memory.

Is the A100 still relevant in 2026?

Yes. According to Signwl data, the A100 remains one of the most widely available and cost-effective GPUs. At ~$2.45/hr blended pricing with 312 FP16 TFLOPS, it offers strong value for training and inference workloads that don't require the latest generation.

Explore Signwl's GPU Data

Live pricing, regional analysis, and comparisons for 39 GPU and AI accelerator types.