Accelerator   Hourly Price   7d Change
H100          $6.39/hr       1.2%
A100 80GB     $2.45/hr       0.5%
H200          $10.29/hr      0.8%
L40S          $1.28/hr       0.3%
T4            $0.24/hr       0.6%
L4            $0.45/hr       1.1%
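The hourly rates above translate into monthly cost estimates with simple arithmetic. A minimal sketch, assuming the quoted on-demand rates and the common 730-hours-per-month billing approximation (both the helper name and the 730-hour figure are illustrative assumptions, not from this page):

```python
# Estimate monthly on-demand cost from the hourly rates quoted above.
# 730 hours/month is a common cloud-billing approximation (assumption).
HOURS_PER_MONTH = 730

hourly_rates = {  # $/hr, taken from the pricing table above
    "H100": 6.39,
    "A100 80GB": 2.45,
    "H200": 10.29,
    "L40S": 1.28,
    "T4": 0.24,
    "L4": 0.45,
}

def monthly_cost(accelerator: str, hours: float = HOURS_PER_MONTH) -> float:
    """Return the estimated USD cost of running `accelerator` for `hours`."""
    return round(hourly_rates[accelerator] * hours, 2)

for name in hourly_rates:
    print(f"{name}: ${monthly_cost(name):,.2f}/month")
```

For example, at $0.24/hr a T4 works out to roughly $175/month, while an H200 at $10.29/hr is closer to $7,500/month; actual bills vary with region and commitment discounts.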

AWS Inferentia2

Inferentia2 architecture · 32GB memory

Cloud Pricing Today

About the AWS Inferentia2

AWS Inferentia2 is the second generation of Amazon's custom inference chip, offering significantly improved performance and support for larger models compared to the original Inferentia.

Memory (VRAM)
32 GB
Architecture
Inferentia2

Common Use Cases

Large model inference · Generative AI serving · Cost-effective inference

Key Facts

Manufacturer
AWS
Architecture
Inferentia2
Accelerator Type
Custom AI accelerator
Primary Use
Inference
Memory (VRAM)
32 GB

Related Accelerators
