GPU         Hourly price   7d change
H100        $6.39/hr       1.2%
A100 80GB   $2.45/hr       0.5%
H200        $10.29/hr      0.8%
L40S        $1.28/hr       0.3%
T4          $0.24/hr       0.6%
L4          $0.45/hr       1.1%
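As a quick illustration of what the hourly rates above imply for sustained use, the snippet below converts them into estimated monthly costs, assuming continuous on-demand use over a 730-hour month (an assumption, not a billing rule):

```python
# Convert the on-demand hourly rates from the table above into an
# estimated monthly cost, assuming continuous use for a 730-hour month.
HOURS_PER_MONTH = 730

hourly_rates = {
    "H100": 6.39,
    "A100 80GB": 2.45,
    "H200": 10.29,
    "L40S": 1.28,
    "T4": 0.24,
    "L4": 0.45,
}

monthly_costs = {gpu: round(rate * HOURS_PER_MONTH, 2)
                 for gpu, rate in hourly_rates.items()}

for gpu, cost in monthly_costs.items():
    print(f"{gpu}: ${cost:,.2f}/month")
```

Real bills will differ with partial utilisation, reserved pricing, or spot discounts; this is only the continuous on-demand ceiling.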

AWS Inferentia

Inferentia architecture · 8GB memory


About AWS Inferentia

AWS Inferentia is a custom machine learning inference chip designed by Amazon. It is optimised for high-throughput, low-cost inference in the AWS cloud, where it powers EC2 Inf1 instances; models are compiled for the chip with the AWS Neuron SDK.
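Since the chip's selling point is cost per inference rather than raw speed, a useful way to compare it against the GPUs listed above is dollars per million inferences. A minimal sketch, where both the $0.50/hr instance price and the 1,000 inferences/second throughput are hypothetical placeholders rather than published AWS figures:

```python
# Rough cost-per-million-inferences estimate for an accelerator billed hourly.
# The price and throughput used in the example are hypothetical placeholders,
# not published AWS figures; substitute your own benchmark numbers.

def cost_per_million(hourly_price_usd: float, inferences_per_second: float) -> float:
    """Dollars per one million inferences at full utilisation."""
    inferences_per_hour = inferences_per_second * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000

# Example: a hypothetical $0.50/hr instance sustaining 1,000 inferences/second.
estimate = cost_per_million(0.50, 1000)
print(f"${estimate:.3f} per million inferences")
```

Running the same formula with a GPU's hourly rate and its measured throughput on your model gives a like-for-like comparison.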

Memory (VRAM)
8 GB
Architecture
Inferentia

Common Use Cases

High-throughput inference
Cost-optimised ML serving
Production deployment

Key Facts

Manufacturer
AWS
Architecture
Inferentia
Accelerator Type
Custom ASIC
Primary Use
Inference
Memory (VRAM)
8 GB


Track AWS Inferentia pricing over time

Get access to historical pricing data, regional analysis, and custom alerts.