What is a GPU?
A GPU (Graphics Processing Unit) is a specialised processor designed for parallel computation. Originally built for rendering graphics, GPUs are now the dominant hardware for training and running artificial intelligence models. According to Signwl data, 39 different GPU and AI accelerator types are currently available across major cloud providers, with blended pricing ranging from under $0.10/hr for budget inference chips to over $17/hr for frontier training accelerators.
How GPUs Work
Unlike CPUs, which are optimised for sequential processing with a few powerful cores, GPUs contain thousands of smaller cores designed to execute many operations simultaneously. This parallel architecture makes them ideal for the matrix multiplication operations that underpin neural network training and inference.
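The independence that makes this possible can be sketched in a few lines of Python (using NumPy). The explicit loop below is illustrative only, not how a real GPU kernel is written, but it shows why matrix multiplication parallelises so well:

```python
import numpy as np

def matmul_naive(A, B):
    """Reference matrix multiply. Each element C[i, j] depends only on
    row i of A and column j of B, so every iteration of the (i, j) loops
    is independent work — exactly what a GPU assigns to separate threads."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((m, n))
    for i in range(m):        # no cross-iteration dependencies:
        for j in range(n):    # all m * n outputs can run simultaneously
            C[i, j] = A[i, :] @ B[:, j]
    return C

rng = np.random.default_rng(0)
A, B = rng.random((8, 16)), rng.random((16, 4))
assert np.allclose(matmul_naive(A, B), A @ B)
```

A CPU walks through those loops a few iterations at a time; a GPU dispatches thousands of them at once, which is where the speedup for neural networks comes from.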
A modern data centre GPU like the NVIDIA H100 contains thousands of CUDA cores alongside specialised Tensor Cores designed specifically for AI workloads. These Tensor Cores can perform mixed-precision matrix operations at extremely high throughput — the H100 delivers 990 FP16 TFLOPS (trillions of floating-point operations per second).
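As a back-of-envelope illustration of that throughput, consider the time to multiply two 8192×8192 FP16 matrices at the quoted 990 TFLOPS peak (real kernels achieve somewhat less than peak, so treat this as a lower bound):

```python
# Back-of-envelope: one 8192x8192 matrix multiply at the H100's
# quoted 990 FP16 TFLOPS peak. A matrix multiply of two n x n
# matrices costs roughly 2 * n^3 floating-point operations
# (one multiply and one add per inner-product term).
n = 8192
flops = 2 * n**3                   # ~1.1e12 operations
peak = 990e12                      # 990 TFLOPS in FLOPs per second
seconds = flops / peak
print(f"{seconds * 1e3:.2f} ms")   # ~1.11 ms
```

Over a trillion operations in about a millisecond — which is why training runs that would take years on CPUs finish in days or weeks on GPU clusters.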
GPUs in Cloud Computing
Cloud providers offer GPU instances that allow organisations to rent GPU compute by the hour rather than purchasing physical hardware. This has democratised access to AI compute — a researcher can rent an H100 for a few dollars per hour rather than investing tens of thousands in hardware.
Cloud GPU pricing varies significantly by GPU type, region, and pricing model (spot, on-demand, or reserved). Signwl tracks these prices daily across all major cloud providers to provide a comprehensive view of the AI compute cost landscape.
GPU vs Other AI Accelerators
While GPUs dominate AI compute, alternative accelerators exist. Google's TPUs (Tensor Processing Units) use a fundamentally different architecture optimised specifically for matrix operations. AWS offers custom chips like Trainium (for training) and Inferentia (for inference). AMD's Instinct MI300X competes directly with NVIDIA's GPUs.
Signwl tracks pricing across all these accelerator types, providing a unified view of AI compute costs regardless of the underlying hardware.
Training vs Inference GPUs
The GPU market is broadly divided into training-class and inference-class accelerators. Training GPUs like the H100, A100, and MI300X prioritise raw compute throughput and memory bandwidth for the computationally intensive process of teaching AI models. Inference GPUs like the T4, L4, and L40S prioritise cost efficiency and power efficiency for serving trained models in production.
According to Signwl data, training-class GPUs typically cost 3-10x more per hour than inference-class GPUs, reflecting their greater compute throughput, memory capacity, and memory bandwidth.
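The ratio can be illustrated with hypothetical hourly rates (the figures below are placeholders for illustration, not actual Signwl quotes):

```python
# Illustrative only: made-up hourly rates, not actual Signwl pricing.
rates = {
    "inference_gpu": 0.60,  # $/hr, e.g. an L4-class card (hypothetical)
    "training_gpu": 4.20,   # $/hr, e.g. an A100-class card (hypothetical)
}
ratio = rates["training_gpu"] / rates["inference_gpu"]
print(f"training-class costs {ratio:.0f}x more per hour")  # 7x
```

At a 7x ratio, serving a model on a training-class card when an inference-class card would do means paying sevenfold for capacity the workload never uses.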
Frequently Asked Questions
What is a GPU used for in AI?
GPUs are used for both training AI models (teaching them from data) and running trained models (inference). Their parallel processing architecture makes them dramatically faster than CPUs for the matrix operations that underpin neural networks. According to Signwl data, 39 GPU and accelerator types are currently available in the cloud for AI workloads.
How much does a GPU cost in the cloud?
Cloud GPU pricing varies widely by type. According to Signwl's blended pricing data (averaging spot, on-demand, and reserved), costs range from under $0.10/hr for budget inference GPUs like the T4 to over $17/hr for frontier training accelerators like the GB200. The NVIDIA H100, the most widely used training GPU, averages around $6/hr.
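That blended rate makes job-level cost estimates straightforward. The job size below is a hypothetical example, not a benchmark:

```python
# Hypothetical job: a fine-tuning run on 8 H100s for 100 hours,
# priced at the ~$6/hr blended average quoted above.
gpus = 8
hours = 100
rate = 6.00                 # $/hr per GPU (blended average)
total = gpus * hours * rate
print(f"${total:,.0f}")     # $4,800
```

The same arithmetic scales to frontier runs: multiply GPU count, wall-clock hours, and hourly rate, and the cost of large training jobs quickly reaches millions of dollars.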
What is the difference between a GPU and a CPU?
CPUs have a few powerful cores optimised for sequential tasks. GPUs have thousands of smaller cores optimised for parallel tasks. This makes GPUs dramatically faster for AI workloads, which involve massive parallel matrix multiplications. A single H100 GPU delivers 990 FP16 TFLOPS compared to single-digit TFLOPS on a CPU.
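Taking those figures at face value gives a rough sense of the gap (the ~5 TFLOPS CPU number is an illustrative assumption, and peak TFLOPS ratios overstate real-world speedups, which depend on memory bandwidth and kernel efficiency):

```python
# Rough peak-throughput comparison. The CPU figure is an assumed
# round number for a high-end server CPU, used only for illustration.
gpu_tflops = 990   # H100 FP16 Tensor Core peak, per the text
cpu_tflops = 5     # assumed high-end CPU peak (illustrative)
speedup = gpu_tflops / cpu_tflops
print(f"~{speedup:.0f}x peak throughput")  # ~198x
```

Even if a real workload captures only a fraction of that peak ratio, the gap is large enough that CPU-based training is impractical for modern models.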
Explore Signwl's GPU Data
Live pricing, regional analysis, and comparisons for 39 GPU and AI accelerator types.