What is a TPU? Google's AI Accelerator Explained
A TPU (Tensor Processing Unit) is a custom AI accelerator designed by Google, purpose-built for machine learning workloads. Unlike GPUs, which are general-purpose parallel processors, TPUs use a systolic array architecture optimised specifically for the matrix operations that underpin neural networks. According to Signwl data, five TPU generations are currently available in the cloud, with pricing ranging from $0.09/hr (TPU v5 Lite) to $10.53/hr (TPU v7).
How TPUs Differ from GPUs
TPUs are built around a fundamentally different architecture from GPUs. Where GPUs offer thousands of general-purpose cores that can execute arbitrary parallel code (via CUDA or ROCm), TPUs rely on systolic arrays — fixed-function hardware designed specifically for matrix multiplication.
This specialisation means TPUs can be more efficient at their target workloads, but less versatile. TPUs excel at large-scale distributed training and inference for models built in JAX or TensorFlow, but don't support the breadth of software that NVIDIA's CUDA ecosystem offers.
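To make the framework angle concrete, here is a minimal JAX sketch (the shapes, dtype, and function name are illustrative, not tied to any particular TPU generation). It jit-compiles a matrix multiply, which XLA lowers onto the TPU's matrix units when a TPU backend is attached, and which compiles for CPU or GPU otherwise.

```python
# Minimal JAX sketch: a jit-compiled matrix multiply. On a TPU backend,
# XLA lowers the dot product onto the chip's matrix units; on CPU or GPU
# it compiles for that backend instead. Shapes and dtype are illustrative.
import jax
import jax.numpy as jnp

@jax.jit
def matmul(a, b):
    return jnp.dot(a, b)

key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
a = jax.random.normal(k1, (1024, 1024), dtype=jnp.bfloat16)  # bfloat16 is the TPU-native format
b = jax.random.normal(k2, (1024, 1024), dtype=jnp.bfloat16)

result = matmul(a, b)
print(result.shape, result.dtype)
```

The same code runs unchanged on CPU, GPU, or TPU; only the backend XLA compiles it for changes.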
TPU Generations
Google has released multiple TPU generations, each bringing significant performance improvements:
- **TPU v2** — the first publicly available TPU, with 8GB of memory
- **TPU v3** — added liquid cooling and 16GB memory for larger models
- **TPU v5p** — designed for large-scale training with 95GB memory per chip
- **TPU v5 Lite** — optimised for cost-efficient inference
- **TPU v6e (Trillium)** — Google's latest generation with improved efficiency
Signwl tracks pricing for all available TPU generations across Google Cloud regions.
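If a TPU runtime is already attached (through Google Cloud or a hosted notebook), a quick way to confirm which generation you are running on is to ask JAX to list its devices. This is a hedged sketch: the exact device_kind strings depend on the generation and runtime version.

```python
# Sketch: list the accelerators JAX can see. On a TPU VM this reports the
# attached TPU chips; on a machine without TPUs it falls back to CPU devices.
import jax

for d in jax.devices():
    # device_kind strings (e.g. "TPU v5e") vary by generation and runtime.
    print(d.id, d.platform, d.device_kind)
```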
When to Consider TPUs
TPUs are most compelling for organisations that:
- Are committed to Google Cloud as their primary platform
- Use JAX or TensorFlow as their primary framework
- Need to scale training to thousands of accelerators (TPU pods) — see the sketch after this list
- Want cost-competitive alternatives to NVIDIA GPUs
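For the pod-scale point above, a minimal data-parallel sketch in JAX looks like the following. It uses jax.pmap to run the same function on every chip the runtime exposes; newer JAX releases favour jit with explicit sharding, but the pattern extends from one host to a pod slice in the same way. The toy computation and sizes are illustrative only.

```python
# Hedged sketch of data parallelism across locally attached TPU chips.
# The same pattern extends to multi-host pod slices.
import jax
import jax.numpy as jnp

n_devices = jax.local_device_count()

@jax.pmap
def squared_sum(x):
    # Each device receives one slice of the leading batch axis.
    return jnp.sum(x ** 2)

# For pmap, the leading axis must equal the number of local devices.
x = jnp.arange(n_devices * 8, dtype=jnp.float32).reshape(n_devices, 8)
print(squared_sum(x))  # one partial result per device
```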
For PyTorch-first organisations or those needing multi-cloud flexibility, NVIDIA GPUs remain the safer choice. However, PyTorch/XLA support for TPUs is improving.
Frequently Asked Questions
Is a TPU better than a GPU?
Neither is universally better. TPUs excel at large-scale training on Google Cloud with JAX/TensorFlow. GPUs (especially NVIDIA) offer broader software compatibility, multi-cloud flexibility, and more mature tooling. The best choice depends on your framework, cloud provider, and workload.
How much does a TPU cost?
According to Signwl's blended pricing data, TPU costs range from $0.09/hr (TPU v5 Lite Pod for inference) to $10.53/hr (TPU v7). TPU v5p, designed for large-scale training, averages around $1.07/hr.
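As a rough worked example (using the blended rates above and assuming continuous on-demand usage, with no committed-use or spot discounts), per-chip monthly costs work out as follows:

```python
# Rough cost arithmetic from the blended hourly rates quoted above.
# Assumes 24/7 on-demand usage; committed-use and spot pricing differ.
hours_per_month = 24 * 30  # ~720 hours

for name, hourly_rate in [("TPU v5 Lite", 0.09), ("TPU v5p", 1.07), ("TPU v7", 10.53)]:
    print(f"{name}: ${hourly_rate * hours_per_month:,.0f} per chip-month")
```

That comes to roughly $65, $770, and $7,582 per chip-month respectively, before any discounts.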
Can I run PyTorch on a TPU?
Yes, through PyTorch/XLA. However, the experience is more mature on GPUs with CUDA. Some PyTorch operations may require adaptation for TPU compatibility, and debugging can be more complex.
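For reference, a minimal PyTorch/XLA sketch looks like the following (API details shift between torch_xla releases; this uses the long-standing xla_model interface and assumes the torch_xla package is installed on a TPU VM):

```python
# Minimal PyTorch/XLA sketch: run a matrix multiply on a TPU device.
# Requires torch and torch_xla; the xla_model API shown here is the
# long-standing interface and may differ in newer torch_xla releases.
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()               # acquire the XLA (TPU) device
a = torch.randn(1024, 1024, device=device)
b = torch.randn(1024, 1024, device=device)

c = a @ b                              # traced lazily as an XLA graph
xm.mark_step()                         # compile and execute the pending graph
print(c.shape)
```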
Explore Signwl's GPU Data
Live pricing, regional analysis, and comparisons for 39 GPU and AI accelerator types.