Comparison

NVIDIA L40S vs NVIDIA A10G

Mid-tier inference — Ada Lovelace vs Ampere

The L40S delivers 2.9x the FP16 performance of the A10G (366 vs 125 TFLOPS) with double the memory (48GB vs 24GB). Both are strong inference GPUs at different price points.

Specifications

Specification      | NVIDIA L40S   | NVIDIA A10G
Manufacturer       | NVIDIA        | NVIDIA
Architecture       | Ada Lovelace  | Ampere
Accelerator Type   | GPU           | GPU
Primary Use        | Inference     | Inference
Memory (VRAM)      | 48 GB         | 24 GB
FP16 Performance   | 366 TFLOPS    | 125 TFLOPS
TDP                | 350 W         | 300 W
Perf per Watt      | 1.05 TFLOPS/W | 0.42 TFLOPS/W
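The Perf per Watt row is simply FP16 TFLOPS divided by TDP. A quick sketch reproducing the figures in the table (the `specs` dict is just the numbers above, not an API):

```python
# Perf per watt = FP16 TFLOPS / TDP (W), rounded as in the table above.
specs = {
    "L40S": {"fp16_tflops": 366, "tdp_w": 350},
    "A10G": {"fp16_tflops": 125, "tdp_w": 300},
}

for name, s in specs.items():
    ppw = round(s["fp16_tflops"] / s["tdp_w"], 2)
    print(f"{name}: {ppw} TFLOPS/W")  # L40S: 1.05, A10G: 0.42
```

The Ada Lovelace part delivers roughly 2.5x the compute per watt, which matters for dense inference deployments where rack power is the binding constraint.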

Detailed Analysis

The L40S and A10G serve the mid-tier inference market but at different capability levels. The L40S, based on Ada Lovelace, offers a substantial performance uplift over the Ampere-based A10G.

The L40S's 48GB of GDDR6 memory enables it to handle models up to approximately 25B parameters with quantisation, while the A10G's 24GB limits it to roughly 13B. This memory advantage makes the L40S significantly more versatile for modern AI workloads.
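The parameter limits above follow from a rough sizing rule: weight memory is parameter count times bytes per parameter, plus headroom for the KV cache and activations. A minimal sketch (the function name, the 20% overhead factor, and the per-precision byte counts are illustrative assumptions, not vendor guidance):

```python
def fits_in_vram(params_b: float, vram_gb: float,
                 bytes_per_param: float = 2.0,
                 overhead: float = 1.2) -> bool:
    """Rough fit check: weights (params_b billions x bytes each)
    plus ~20% headroom for KV cache/activations vs available VRAM."""
    needed_gb = params_b * bytes_per_param * overhead
    return needed_gb <= vram_gb

# A 13B model in FP16 (2 bytes/param) needs ~31 GB -> over the A10G's 24 GB,
# but at INT4 (0.5 bytes/param) it needs ~8 GB and fits comfortably.
print(fits_in_vram(13, 24, bytes_per_param=2.0))  # False
print(fits_in_vram(13, 24, bytes_per_param=0.5))  # True
print(fits_in_vram(25, 48, bytes_per_param=1.0))  # True (INT8 on the L40S)
```

Real serving stacks vary in overhead (longer contexts inflate the KV cache substantially), so treat this as a first-pass estimate rather than a guarantee.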

The L40S also delivers 2.9x the FP16 TFLOPS (366 vs 125), meaning higher throughput for any given model. Its Ada Lovelace architecture includes hardware-accelerated ray tracing, making it a dual-purpose GPU for mixed AI and graphics workloads.

The A10G remains popular due to its lower cost and wide availability. For inference workloads that fit within 24GB of memory, the A10G can deliver acceptable performance at a significantly lower price point.

Verdict

Best for Training

L40S for fine-tuning tasks that need 48GB. A10G for small-scale fine-tuning within 24GB.

Best for Inference

L40S for larger models and higher throughput. A10G for cost-sensitive deployments with smaller models.

Best Value

A10G for workloads within its 24GB memory. L40S when you need the extra memory and throughput.

Frequently Asked Questions

Is the L40S good for inference?

Yes — the L40S is one of the best price/performance inference GPUs with 48GB memory and 366 FP16 TFLOPS. It can handle models up to ~25B parameters with quantisation.
