NVIDIA T4
Turing architecture · 16GB memory · 65 FP16 TFLOPS · 70W TDP
About the NVIDIA T4
The NVIDIA T4 is one of the most widely available and affordable GPUs in the cloud, making it a popular entry point for AI inference workloads. Based on the Turing architecture with 16GB of GDDR6 memory, it delivers 65 FP16 TFLOPS while consuming only 70W — one of the lowest power draws of any data centre GPU.
Despite being built on a legacy architecture, the T4 remains relevant because of its exceptional availability and low cost. It is available in virtually every cloud region from all major providers, and its low power consumption makes it cost-effective for always-on inference services.
The T4 introduced INT8 and INT4 precision support for inference workloads, enabling quantised model serving that can significantly improve throughput. It can efficiently serve models up to approximately 7 billion parameters with appropriate quantisation.
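The claim that a ~7B-parameter model needs quantisation to fit in 16GB can be checked with a back-of-envelope weight-memory estimate. This is an illustrative sketch: the bytes-per-parameter figures are standard for each precision, but the 20% overhead factor for KV cache, activations, and runtime is an assumption, not a measured value.

```python
# Rough VRAM estimate for serving a ~7B-parameter model at different precisions.
# OVERHEAD is an assumed ~20% allowance for KV cache, activations, and runtime.

PARAMS = 7e9  # ~7B parameters (Llama 2 7B / Mistral 7B class)
BYTES_PER_PARAM = {"FP16": 2.0, "INT8": 1.0, "INT4": 0.5}
OVERHEAD = 1.2
T4_VRAM_GB = 16

for precision, nbytes in BYTES_PER_PARAM.items():
    gb = PARAMS * nbytes * OVERHEAD / 1e9
    verdict = "fits" if gb <= T4_VRAM_GB else "does not fit"
    print(f"{precision}: ~{gb:.1f} GB -> {verdict} in {T4_VRAM_GB} GB")
```

Under these assumptions, a 7B model at FP16 (~16.8 GB) just exceeds the T4's 16GB, while INT8 (~8.4 GB) and INT4 (~4.2 GB) fit comfortably, which is why quantised serving is the standard approach on this card.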
For budget-conscious deployments, the T4 offers strong value. While newer GPUs like the L4 provide better performance, the T4's extremely low cost per hour makes it the go-to choice for lightweight inference, development environments, and cost-optimised serving.
Key Facts
- Manufacturer: NVIDIA
- Architecture: Turing
- Accelerator Type: GPU
- Primary Use: Inference
- Memory (VRAM): 16 GB
- FP16 Performance: 65 TFLOPS
- Thermal Design Power: 70W
Frequently Asked Questions
How much does a T4 GPU cost per hour?
The NVIDIA T4 is one of the most affordable cloud GPUs, with blended pricing typically between $0.15–$0.35 per hour. Spot pricing can drop below $0.10/hr in some regions.
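Those hourly rates translate into a modest always-on bill. The sketch below assumes a 730-hour month and uses the illustrative price range quoted above, not live provider pricing.

```python
# Back-of-envelope monthly cost for an always-on T4 instance.
# Hourly rates are the illustrative range quoted above, not live prices.

HOURS_PER_MONTH = 730  # average hours in a month

def monthly_cost(hourly_rate: float) -> float:
    """Cost in dollars of running one instance non-stop for a month."""
    return hourly_rate * HOURS_PER_MONTH

low, high = 0.15, 0.35
print(f"~${monthly_cost(low):.0f} to ~${monthly_cost(high):.0f} per month")
```

At these assumed rates, an always-on T4 lands in roughly the $110-$255/month range, which is what makes it attractive for continuously running inference endpoints.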
What models can run on a T4?
The T4's 16GB of memory supports models up to approximately 7B parameters with quantisation (INT8/INT4). It can run inference on models like Llama 2 7B, Mistral 7B, and similar-sized models. Larger models require multi-GPU setups or higher-memory GPUs.
T4 vs L4 — which is better?
The L4 is the T4's successor, offering ~1.9x FP16 performance (121 vs 65 TFLOPS), 50% more memory (24GB vs 16GB), and similar low power consumption (72W vs 70W). The L4 is better for most inference workloads, but the T4 is cheaper and more widely available.
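The ratios quoted above follow directly from the spec-sheet numbers, as this small check shows (the figures are the ones stated in this page, not independently measured):

```python
# Spec-sheet comparison of T4 vs L4 using the figures quoted above.
specs = {
    "T4": {"fp16_tflops": 65, "vram_gb": 16, "tdp_w": 70},
    "L4": {"fp16_tflops": 121, "vram_gb": 24, "tdp_w": 72},
}

perf_ratio = specs["L4"]["fp16_tflops"] / specs["T4"]["fp16_tflops"]
mem_ratio = specs["L4"]["vram_gb"] / specs["T4"]["vram_gb"]

print(f"L4/T4 FP16 ratio:   {perf_ratio:.2f}x")  # ~1.86x, the "~1.9x" quoted
print(f"L4/T4 memory ratio: {mem_ratio:.2f}x")   # 1.50x, i.e. 50% more memory
```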
Related Accelerators
Compare NVIDIA T4
L4 delivers ~1.9x performance (121 vs 65 TFLOPS) with 50% more memory (24GB vs 16GB) at similar power. L4 is better for most workloads; T4 is cheaper and more widely available.
A10G delivers 1.9x the performance (125 vs 65 TFLOPS) with 50% more memory (24GB vs 16GB). A10G costs roughly 2x per hour but handles larger models and higher throughput.