Market Pulse
The GPU compute market is splitting structurally into training and inference tiers, and the split is further along than spot prices alone suggest. AWS Inferentia spot in Virginia has fallen roughly 90% from its January peak of $0.125/hr to $0.013/hr today, the pricing signature of a product being commercially wound down, while Trainium spot in Ohio has gone through a deliberate step-change activation since late January, rising from near zero to a managed $0.10–0.178/hr range. This ASIC rotation is the clearest signal in the dataset. Alongside it, the 90-day history of the Montreal H100 spot market shows a full downward convergence cycle (from $7.46/hr in January to a $1.90/hr trough in mid-May, now re-inflating to $4.98/hr with a 197% spread), a cautionary case study in how misleading cross-sectional spot snapshots can be for procurement decisions. The dominant macro theme is power: GPU supply geography is visibly reorganizing around energy availability (GCP L4 emerging in Nevada and Iowa at near-zero spot prices), and operators that have secured power supply independence, such as Nebius with its $2.6B, 10-year Bloom Energy deal for its Missouri gigawatt campus, are building a structural moat that grid-dependent competitors have not matched.
Key Movers
| Component | Region | Type | Price ($/hr) | 24h Δ | 7d Δ | 30d Δ | Flag |
|---|---|---|---|---|---|---|---|
| V100_32GB | Tokyo | SPOT | ~$2.10 | — | — | +390% | APAC capacity crunch; single provider, no spread — genuine scarcity signal |
| TPU_V6E | Tokyo | SPOT | — | — | — | +144% | Corroborates APAC crunch; even GCP ASIC spot being squeezed |
| H100 | Montreal | SPOT | $4.98 | — | — | +121% | Re-inflation from May trough; market oscillating, NOT converging upward to OD |
| A100_40GB | Montreal | SPOT | — | — | — | +125% | Same bifurcated provider dynamic as H100 Montreal |
| Trainium | Ohio | SPOT | $0.178 | +97% | — | +69% | AWS ASIC rotation signal — actively managed fleet; NOT noise |
| Inferentia | Virginia | SPOT | $0.013 | — | — | -61% | 97% peak-to-trough collapse from Jan peak of $0.125/hr (trough $0.003/hr); end-of-life market clearing |
| Inferentia | Ohio | SPOT | — | — | — | -46% | Confirms US-wide Inferentia wind-down, not regional |
| Inferentia | Hong Kong | SPOT | — | — | — | -28% | APAC Inferentia also draining; global deprecation signal |
| L4 | Nevada | SPOT | $0.004–0.007 | — | +186% | — | New GCP regional pool activation; supply following power, not demand |
| L4 | Iowa | SPOT | $0.004–0.007 | — | +137% | — | Same pattern; Iowa wind power enabling new capacity |
| V520 | Frankfurt | SPOT | — | — | — | +187% | EU media GPU tightening; bifurcated from FPGA softness |
| VIRTEX FPGA | Frankfurt | SPOT | — | — | -31% | — | EU specialty compute softening; separate from GPU dynamics |
| T4G | Tokyo | SPOT | $0.0026 | — | +8,413% | — | NOISE: near-zero base, single provider, percentage meaningless |
| ALVEO_U30 | Oregon | SPOT | — | — | +33,050% | — | NOISE: liquidity artifact from near-zero base; disregard entirely |
| UNMAPPED vCPU/RAM | Various | Various | — | -99.99% | — | — | ARTIFACT — Pricing taxonomy reclassification; not a real price move |
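The NOISE flags in the table rest on a simple idea: a percentage move computed off a near-zero base says little about the market. A minimal sketch of such a filter follows. The `Move` record, the `MIN_BASE_PRICE` cutoff of $0.001/hr, and the function names are assumptions made for illustration, not part of the dataset or of any provider API.

```python
from dataclasses import dataclass

# Hypothetical cutoff: moves whose implied starting price is below this
# value are treated as noise. The number is an assumption, not a standard.
MIN_BASE_PRICE = 0.001  # $/hr


@dataclass
class Move:
    component: str
    current_price: float  # $/hr
    pct_change: float     # 8413.0 means +8,413%


def base_price(move: Move) -> float:
    """Back out the starting price implied by current price and % change."""
    return move.current_price / (1.0 + move.pct_change / 100.0)


def is_noise(move: Move, min_base: float = MIN_BASE_PRICE) -> bool:
    """Flag a move as noise when its implied base price is below min_base."""
    return base_price(move) < min_base


t4g = Move("T4G Tokyo", current_price=0.0026, pct_change=8413.0)
h100 = Move("H100 Montreal", current_price=4.98, pct_change=121.0)

print(t4g.component, is_noise(t4g))    # implied base is far below the cutoff
print(h100.component, is_noise(h100))  # implied base is above $2/hr
```

A base-price cutoff is only one possible test; the table also weighs whether a single provider is quoting, which this sketch does not model.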
Investable Insights
H1 — AWS Is Executing a Deliberate ASIC Transition: Inferentia Deprecation, Trainium2 Activation
Confidence: 4 / 5
Thesis: The 90-day spot pricing histories for Inferentia and Trainium deliver the clearest structural signal in this dataset: AWS is commercially winding down Inferentia through the spot mechanism while simultaneously activating Trainium as a managed commercial product. This is not a supply fluctuation — it is the pricing fingerprint of a deliberate architectural transition, and the timeline is already well advanced. The market signal precedes any formal AWS product announcement, which is precisely what makes it actionable. Marvell, which architects Amazon's Trainium chips, is the most direct beneficiary of a confirmed Trainium2 ramp; its $11B AI ASIC revenue projection for 2026 hinges on AWS executing this transition at scale.
Key Evidence:
- Inferentia SPOT Virginia collapsed from $0.125/hr in early January 2026 to a trough of $0.003/hr by mid-February — a 97% collapse — before stabilizing in a low-volatility band currently at $0.013/hr (live pricing data, June 14, 2026). This is a scrap-rate clearing pattern, not normal market fluctuation.
- Inferentia on-demand pricing has remained absolutely frozen across all US regions and Hong Kong for 30+ days — zero movement — confirming AWS has stopped active commercial management of the platform. Spot and OD are completely decoupled.
- Trainium SPOT Ohio underwent a step-change activation around January 25, 2026: prices jumped from near-zero (~$0.004/hr) to $0.255/hr and peaked at $0.490/hr on February 21 before settling into a dynamically managed $0.10–0.178/hr range (live pricing history, June 14, 2026). This is active fleet management, not passive availability.
- Trainium SPOT in Melbourne simultaneously fell 82.4% over 30 days, consistent with AWS concentrating Trainium capacity in US home regions while letting APAC spot drain, a classic fleet rationalization move.
- Amazon Q1 2026 earnings disclosure confirmed: "Trainium 2 and Inferentia 3 chips deployed to shift workloads away from Nvidia H100/H200 infrastructure" (abhs.in, June 13, 2026) — direct management confirmation of the architectural pivot.
- CATALOG_SURVIVAL data shows Inferentia has been active in AWS catalog for 6.5 years — consistent with an end-of-life product whose catalog presence lags its commercial relevance.
- Broadcom and Marvell control approximately 95% of custom ASIC co-design, with ASICs projected to capture 27.8% of AI server compute market share in 2026, growing 44.6% YoY (AOL, June 14, 2026) — the structural tailwind for Trainium2 demand is real.
- Qualifying evidence: Trainium on-demand pricing is also frozen at $0.94/hr across Ohio, Oregon, and Virginia with zero 30-day delta. Until AWS cuts Inferentia on-demand pricing by 20–30% (the classic deprecation announcement move), the transition remains visible in spot but officially unacknowledged — reducing certainty on the timeline.
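The Inferentia figures above mix two different declines: peak to trough and peak to current. The short calculation below separates them using only the Virginia spot prices quoted in the evidence. The function name `pct_decline` is an illustrative choice.

```python
def pct_decline(peak: float, value: float) -> float:
    """Percentage decline from peak to value (a positive result is a fall)."""
    if peak <= 0:
        raise ValueError("peak must be positive")
    return (peak - value) / peak * 100.0


# Inferentia SPOT Virginia, $/hr, as quoted in the evidence list
PEAK = 0.125     # early January 2026
TROUGH = 0.003   # mid-February 2026
CURRENT = 0.013  # June 14, 2026

peak_to_trough = pct_decline(PEAK, TROUGH)
peak_to_current = pct_decline(PEAK, CURRENT)

print(f"peak to trough:  {peak_to_trough:.1f}%")   # about 97.6%
print(f"peak to current: {peak_to_current:.1f}%")  # about 89.6%
```

The 97% figure therefore describes the February trough, while the price today sits roughly 90% below the January peak.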
Implied Action:
- Long Marvell (MRVL) on Trainium2 ramp confirmation. Watch for new Trainium2 instance types appearing in the AWS EC2 pricing API in Virginia or Ohio — the moment spot pricing exists for Trainium2, it signals fleet inventory has reached meaningful scale. That is the entry trigger.
- Risk for Inferentia-dependent enterprises: Any workload built around Inferentia spot pricing at January 2026 rates no longer has an economic basis. The $0.013/hr spot floor looks cheap next to the $0.05/hr on-demand basis of a year ago, but it is a clearing price on a platform being wound down, not a durable discount. Migrate to Trainium or re-evaluate inference architecture now, before on-demand cuts make the platform formally end-of-life and support timelines compress.
- Monitor: AWS re:Invent preview announcements (typically September) for any Trainium2 general availability timeline. A GA announcement would accelerate the Inferentia clearing process and confirm the MRVL thesis.
H2 — Power-Sovereign Operators Will Bifurcate the GPU Cloud Market Within 18–24 Months
Confidence: 4 / 5
Thesis: The binding constraint on GPU cloud supply has shifted from silicon to watts, and the operators who have secured power supply independent of the grid are building a structural margin moat that grid-dependent competitors cannot replicate on a 12–24 month horizon. NERC's large-load interconnection queue data (Northern Virginia interconnection timelines up to 14 years), ERCOT's receipt of 198 GW of new large-load applications in Q1 2026 alone against its ~90 GW total peak capacity, and ComEd's 12% residential rate hike (effective June 2026) all point to the same constraint: grid power for new data center capacity is either unavailable, prohibitively expensive, or politically contested. Nebius Group's $2.6B, 10-year Bloom Energy fuel cell deal for on-site generation at its Missouri gigawatt campus is not merely a financing event — it is the template for what "power-sovereign" infrastructure looks like. The GPU pricing data is already showing supply organizing around power geography rather than demand geography.
Key Evidence:
- GCP L4 spot instances emerged in Nevada at +186% over 7 days and Iowa at +137% over 7 days at near-zero implied prices ($0.004–0.007/hr) — new regional pool activations in states with abundant renewable power (Nevada solar, Iowa wind), not in response to co-located demand (live pricing data, June 14, 2026).
- B200 on-demand instances are live on AWS in Ohio, Virginia, and Oregon at $12.83–$14.60/hr, with is_active: false on Azure and GCP; AWS's B200 advantage maps precisely onto its US power-advantaged regions (live ticker data, June 14, 2026).
- Nebius Q1 2026 revenue: $399M (6.8x YoY growth), 45% adjusted EBITDA margin, with 3.5 GW of contracted power capacity across Missouri and Pennsylvania sites (news feed, June 2026). The fuel cell deal eliminates grid interconnection risk at those sites entirely.
- Nebius (NBIS) equity: $232.36/share, up 393% year-on-year, PEG ratio of 0.63, trading 17% below its 52-week high of $278.84 — the market has begun pricing the power moat but the PEG suggests it is not yet fully reflected (equities data, June 14, 2026).
- Florida SB 484 (signed May 2026): utilities cannot pass data center power costs to residential customers; local governments gain data center veto power — state-level regulatory fragmentation creating site-selection headwinds for grid-dependent operators (intel feed, May 2026).
- IEA projection: data centers could consume about 945 TWh by 2030, more than double 2024 levels, making energy cost the primary variable in cloud GPU economics within 24 months.
- Qualifying evidence: Federal permitting reform remains a tail risk to this thesis. If Congress passes grid interconnection legislation that materially shortens queue timelines, the power-sovereign moat narrows. Legislative probability appears low in 2026, but a bipartisan energy bill cannot be ruled out.
- CoreWeave (CRWV) at $100.55/share, $35B total debt, -$4.7B quarterly free cash flow, with no disclosed power supply agreements (equities data, June 14, 2026) — the grid-dependent contrast case.
Implied Action:
- Long Nebius (NBIS): $232 with analyst mean target of $244 understates the power-infrastructure moat. The August 6 earnings report is the first validation event — watch for gross margin expansion above 50% and any disclosure of additional power contract terms. A confirmed margin expansion would be the re-rating catalyst toward the $278 52-week high and beyond.
- Long Bloom Energy (BE): Every gigawatt-scale AI factory choosing on-site fuel cells over grid connection is a direct BE revenue event. The Nebius deal is the proof-of-concept; replication by other neoclouds or hyperscalers is the upside catalyst.
- Short-side awareness: APLD (Applied Digital), IREN, and HUT (Hut 8) run at 4–6x beta with no disclosed power-sovereign infrastructure. As grid costs escalate and regulatory friction increases, their margin compression will be structural, not cyclical. Monitor their Q2 earnings for power cost disclosures.
H3 — APAC On-Demand H100 Carries a Real Structural Premium; Spot Framing Was Misleading
Confidence: 3 / 5
Thesis: The APAC H100 spot market is volatile and cyclical, not monotonically tight, and the original structural scarcity narrative was a misread of cross-sectional data. Tokyo H100 spot peaked near $5.50/hr in early January 2026, compressed 47% to a trough of $2.93/hr by mid-March, and now trades at $2.67/hr; the +12% 30-day delta is a rebound within that oscillation, not evidence of accelerating scarcity. Seoul tells the same oscillating story. The on-demand tier, however, tells a genuinely different story: Osaka on-demand H100 at $10.93/hr (the highest globally, essentially flat at -0.23% over 30 days), Tokyo at $8.81/hr, and Seoul at $8.63/hr, all with 23–37% on-demand spreads, reflect a market where enterprises are paying premium rates and providers are not discounting. The sovereign AI buildout (Nvidia/LG Korea partnership announced June 13, SK Telecom 2027 AI factory, Naver's GAK Sejong campus) is real and is driving persistent on-demand demand, but the spot market is a poor proxy for that signal.
Key Evidence:
- Tokyo H100 SPOT 90-day history: $5.44–5.54/hr on January 8–11 (single provider), dropped to $4.25/hr on January 12 with 88% spread (second provider entry), compressed to a trough of $2.93/hr by March 12 — a 47% collapse from peak (pricing history, June 14, 2026).
- Seoul H100 SPOT hit a trough near $1.06/hr in February (extreme spread indicating one provider near zero) before recovering to the current $2.57/hr — a 142% recovery from trough, not a trend of persistent tightening (pricing history, June 14, 2026).
- Osaka H100 on-demand: $10.93/hr, -0.23% over 30 days — the highest in the global dataset and essentially price-inelastic, confirming real on-demand demand pressure (live ticker data, June 14, 2026).
- Tokyo on-demand spread has narrowed from 59% in January to 21% today — providers converging, not diverging, suggesting the market is moving toward equilibrium at a high price level, not toward a crisis (pricing history, June 14, 2026).
- Nvidia/LG Group announced Korean data center partnership (Yahoo Finance, June 13, 2026); SK Telecom 2027 AI factory plans confirmed — sovereign AI buildout consuming existing cloud capacity.
- Qualifying evidence: H100 spot discounts in Tokyo (~70% below OD) remain in line with US levels — the spot market is not uniquely suppressed. The APAC premium is an on-demand story, not a spot story.
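Two ratios carry most of the argument in this evidence list: the spot discount to on-demand and the recovery from a trough. The sketch below reproduces both from the Tokyo and Seoul prices quoted in this section. The function names are illustrative, and the Tokyo spot and on-demand figures are taken from the thesis paragraph.

```python
def spot_discount_pct(spot: float, on_demand: float) -> float:
    """How far spot sits below on-demand, as a percentage of on-demand."""
    if on_demand <= 0:
        raise ValueError("on_demand must be positive")
    return (1.0 - spot / on_demand) * 100.0


def recovery_pct(trough: float, current: float) -> float:
    """Percentage gain from a trough price to the current price."""
    if trough <= 0:
        raise ValueError("trough must be positive")
    return (current - trough) / trough * 100.0


# Prices in $/hr as quoted in this section
tokyo_discount = spot_discount_pct(spot=2.67, on_demand=8.81)
seoul_discount = spot_discount_pct(spot=2.57, on_demand=8.63)
seoul_recovery = recovery_pct(trough=1.06, current=2.57)

print(f"Tokyo spot discount to OD: {tokyo_discount:.0f}%")  # roughly 70%
print(f"Seoul spot discount to OD: {seoul_discount:.0f}%")  # roughly 70%
print(f"Seoul recovery from trough: {seoul_recovery:.0f}%")  # roughly 142%
```

Both cities show a spot discount near 70%, which is why the premium reads as an on-demand story rather than a spot story.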
Implied Action:
- Procurement: Enterprises requiring APAC H100 capacity should lock 1-year on-demand reserved contracts in Tokyo ($8.36–8.90/hr range) rather than relying on spot. Spot availability is cyclical — the January peak to March trough shows a 47% swing in 60 days — and APAC reserved rates at current levels are likely the floor before sovereign demand further constrains supply.
- Investment: Neoclouds with APAC GPU inventory have real on-demand pricing power. The spot scarcity framing is too volatile to trade directly; the OD premium is the durable signal. Watch for any GCP Tokyo capacity expansion announcements — new supply would compress the premium fastest.
- Avoid: Treating APAC spot price moves as a directional trend signal without 90-day context. The January–March compression proves these markets oscillate sharply.
H4 — H100 On-Demand Compression via B200 Scaling Is Real but 12–18 Months Away
Confidence: 3 / 5
Thesis: B200 availability is real — AWS has is_active: true for B200 on-demand at $12.83–14.60/hr in Ohio, Virginia, and Oregon — but it remains a single-provider introduction at a 35–68% premium to H100 on-demand, making it a complement rather than a substitute at current availability. Azure and GCP both show B200 as is_active: false, meaning the competition that would force H100 pricing down has not yet materialized. H100 on-demand is essentially stationary (Virginia at -0.003% over 30 days, Osaka at -0.23%), confirming no current compression. However, the directional thesis is correct on a longer horizon: when Azure and GCP bring B200 live, the three-way provider competition at the Blackwell tier will pull premium demand away from H100, and H100 on-demand prices will begin the same slow decline H200 is already experiencing (-0.03% to -0.56% per 30 days across all global regions). The procurement implication — avoid 3-year H100 reserved commitments — is sound even at 3/5 confidence, because the downside of being wrong is a 3-year lock at today's high rates.
Key Evidence:
- B200 on-demand live on AWS only at $12.83/hr (Ohio/Virginia) and $14.60/hr (Oregon), with is_active: false on Azure and GCP; a meaningful competitive asymmetry that limits near-term H100 pressure (live ticker data, June 14, 2026).
- H100 on-demand Virginia: $8.69/hr, -0.003% over 30 days; essentially zero movement, confirming no current B200 substitution pressure (live ticker data, June 14, 2026).
- H200 on-demand declining at -0.03% to -0.56% per 30 days across all 10 observed regions globally — the current-generation premium is already commoditizing, establishing the trajectory for H100 (live ticker data, June 14, 2026).
- A100 40GB and 80GB remain is_active: true on AWS, Azure, and GCP after 6.08 and 5.57 years respectively (CATALOG_SURVIVAL data, June 14, 2026); GPU generations do not exit clouds quickly, reducing urgency of the compression thesis.
- Nvidia Q2 FY2027 guidance: $91B revenue; continued Blackwell volume production implies supply ramp, which is the prerequisite for Azure/GCP B200 activation (Yahoo Finance, June 14, 2026).
- CoreWeave (CRWV): $100.55/share, $35B total debt, collateralized largely by H100 clusters, down 9.7% over 30 days (equities data, June 14, 2026) — the equity most exposed to H100 replacement value compression.
- Qualifying evidence: HBM production fully sold out through 2026 (Micron disclosure, June 2026) — HBM scarcity constrains B200 supply ramp, potentially delaying the Azure/GCP activation timeline beyond current estimates.
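To put the H200 figures in annual terms, the 30-day deltas can be compounded. This is a rough sketch under a strong assumption that is not in the data: that each 30-day rate holds constant for twelve consecutive periods. The function name is illustrative.

```python
def annualized_decline_pct(delta_30d_pct: float, periods: int = 12) -> float:
    """Compound a constant 30-day % change over `periods` periods.

    Assumes the 30-day rate persists unchanged, which is an illustration
    only. Returns the cumulative decline as a positive percentage.
    """
    growth = (1.0 + delta_30d_pct / 100.0) ** periods
    return (1.0 - growth) * 100.0


# H200 on-demand 30-day deltas quoted above: -0.03% to -0.56%
slow = annualized_decline_pct(-0.03)
fast = annualized_decline_pct(-0.56)

print(f"slowest region: about {slow:.2f}% per year")
print(f"fastest region: about {fast:.2f}% per year")
```

Under that assumption the observed range works out to well under 1% per year at the slow end and a mid-single-digit annual decline at the fast end, which fits the description of a slow decline.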
Implied Action:
- Procurement: Strongly prefer 1-year H100 reserved contracts over 3-year. The rate premium for 1-year (~$7–8/hr in major US regions vs. $2.50–3.50/hr for 3-year) is real, but optionality as B200 scales is worth that premium. Committing to 3-year H100 reservations today means being on a depreciating platform at above-market rates from approximately 2027 onward.
- Investor risk: CoreWeave's H100-collateralized debt load is the primary covenant risk as replacement value compresses. Watch the Q2 2026 CRWV earnings (August 12) for any disclosure of collateral valuation methodology — if they begin marking H100 clusters below cost, it is the first sign of debt-to-collateral deterioration.
- Trigger to accelerate timeline: B200 spot instances appearing in the AWS EC2 pricing API in Virginia or Ohio would signal fleet inventory has reached scale — at that point, Azure/GCP activation follows within 1–2 quarters and the H100 compression thesis becomes near-term.
Risk Flags
Immediate (0–30 Days)
R1 — Montreal H100 Spot: Procurement Whipsaw Risk The Montreal H100 spot market is in a bifurcated re-inflation phase: one provider at ~$7.45/hr (near on-demand), one at ~$2.51/hr (distressed), with a 197% spread. Enterprises treating the $2.51/hr tier as stable, recurring spot access are exposed to sudden pool exhaustion — exactly what happened to the expensive provider in late April when it collapsed from $7.46/hr to $2.52/hr in under three weeks. The 90-day history shows this market oscillates between full bifurcation and full convergence in cycles of approximately 6–8 weeks. Any procurement decision based on the current spot rate without accounting for cycle position risks a 3x price step-up on zero notice. Watch: Spread compression below 100% as the signal that the cheap pool is draining.
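The spread figures in R1 can be checked directly from the two provider quotes. A minimal sketch follows, assuming the spread is defined as the gap between the high and low quotes relative to the low quote; that definition is inferred from the numbers and is not stated in the data feed. The function names and the 100% threshold parameter are illustrative.

```python
def spread_pct(high: float, low: float) -> float:
    """Gap between the high and low quotes, as a percentage of the low quote."""
    if low <= 0:
        raise ValueError("low must be positive")
    return (high - low) / low * 100.0


def spread_below_watch_level(high: float, low: float,
                             threshold_pct: float = 100.0) -> bool:
    """True when the spread has compressed below the watch threshold."""
    return spread_pct(high, low) < threshold_pct


# Montreal H100 spot quotes, $/hr
HIGH, LOW = 7.45, 2.51

current_spread = spread_pct(HIGH, LOW)
midpoint = (HIGH + LOW) / 2
step_up = HIGH / LOW

print(f"spread: {current_spread:.0f}%")    # about 197%
print(f"midpoint: ${midpoint:.2f}/hr")     # about $4.98/hr
print(f"step-up if cheap pool drains: {step_up:.1f}x")
print("watch signal:", spread_below_watch_level(HIGH, LOW))
```

The simple midpoint of the two quotes appears consistent with the $4.98/hr headline price, which suggests the headline is an average that no buyer actually pays.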
R2 — Inferentia Enterprise Exposure: Hidden Cliff Risk Enterprises on Inferentia spot at January 2026 rates ($0.10–0.125/hr) who have not yet migrated face an exposure they may not have modeled: the spot floor has already collapsed 90%, and on-demand price cuts (the formal deprecation signal) could arrive in the next 30–60 days. When AWS formally announces Inferentia end-of-life or on-demand price cuts, spot SLA commitments from AWS may also shift. Any workload with implicit Inferentia spot price assumptions embedded in a business model or customer contract needs to be identified and triaged now.
R3 — APAC Spot Volatility: Directional Misread Amplification The Tokyo and Seoul H100 spot histories show 40–50% swings in 60-day windows. Any automated procurement or spot-bidding system with APAC H100 exposure that anchored on January peak pricing ($5.50/hr) without adjusting for the March trough ($2.93/hr) would have materially over-provisioned reserves. Systems using 30-day rolling averages in APAC without a longer baseline are systematically misreading market position. Action: Extend spot pricing lookback to 90 days minimum for APAC H100 bid strategies.
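The lookback point in R3 can be illustrated by asking where the latest price sits within its own range over 30 days versus 90 days. The sketch below uses a synthetic price series, not the actual Tokyo history: a 60-day slide from $5.50 to $3.00 followed by a 30-day rebound to $3.40. The function name `range_position` is an assumption made for illustration.

```python
from typing import Sequence


def range_position(prices: Sequence[float], window: int) -> float:
    """Position of the latest price within the min-max range of the last
    `window` observations: 0.0 is the window low, 1.0 is the window high."""
    recent = list(prices[-window:])
    if not recent:
        raise ValueError("prices must not be empty")
    low, high = min(recent), max(recent)
    if high == low:
        return 0.5  # flat window: no range to position within
    return (recent[-1] - low) / (high - low)


# Synthetic daily series (illustrative only): 60 days down, 30 days up
decline = [5.5 - 2.5 * i / 59 for i in range(60)]         # 5.50 -> 3.00
rebound = [3.0 + 0.4 * (i + 1) / 30 for i in range(30)]   # ~3.01 -> 3.40
prices = decline + rebound

pos_30 = range_position(prices, 30)
pos_90 = range_position(prices, 90)

print(f"30-day range position: {pos_30:.2f}")  # at the top of its window
print(f"90-day range position: {pos_90:.2f}")  # near the bottom of its window
```

On the same final price, the 30-day window reads as a market at its high while the 90-day window reads as a market near its low, which is the misread the longer lookback is meant to prevent.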
Near-Term (30–90 Days)
R4 — ComEd Rate Hike Cascading to Northern Illinois Data Centers ComEd's 12% residential rate hike (effective June 2026) in northern Illinois is a leading indicator of a broader utility cost-pass-through trend. Northern Illinois hosts significant data center density (primarily Microsoft Azure's Chicago region cluster). If utilities across PJM begin reclassifying data center loads and raising industrial rates to recover grid upgrade costs — a dynamic already occurring in Virginia and Illinois — on-site power costs for grid-dependent operators will rise 10–15% within 90 days in affected regions. This directly compresses margins at any neocloud relying on grid power in PJM territory. Watch: Illinois Commerce Commission filings for follow-on rate cases and any Microsoft/AWS Ohio-region capacity announcements that signal migration away from Illinois infrastructure.
R5 — Broadcom/Google TPU Locked-In Supply: H100 Demand Destruction Broadcom's locked-in TPU supply agreement with Alphabet through 2031, together with the broader ASIC market capturing 27.8% of AI server compute in 2026, creates an underappreciated near-term demand destruction risk for merchant H100 supply. If major AI labs (Anthropic's 3.5 GW TPU commitment, Google's internal TPU deployment) pull workloads away from H100 at an accelerating rate, the H100 on-demand price floor, currently sticky at $8.69/hr in Virginia, could soften faster than the B200-driven compression thesis suggests. This is a demand-pull scenario, not a supply-push one, and would show up first as widening H100 on-demand spreads as providers compete for a smaller buyer pool.
R6 — HBM Scarcity as a Hidden GPU Supply Ceiling Micron's HBM production is fully sold out through 2026. This creates a physical ceiling on B200/H200 supply expansion that neither AWS, Azure, nor GCP can work around unilaterally. If HBM allocation constraints delay Azure and GCP B200 activation beyond Q4 2026, the H100 compression thesis timeline extends to 24+ months — but it also means H100 supply is not growing either, supporting current on-demand price floors. The HBM constraint is simultaneously a buffer for H100 incumbents and a cap on next-gen ramp. Watch: Micron and SK Hynix HBM capacity announcements in Q3 2026 earnings for any indication of supply loosening.
Structural (6–24 Months)
R7 — Grid Interconnection as a 14-Year Structural Supply Ceiling FERC's 2025 State of Markets report and NERC reliability guidelines confirm what the intel feed has been signaling: Northern Virginia grid interconnection timelines of up to 14 years, PJM facing a 15 GW capacity shortfall by 2030, and ERCOT receiving 198 GW of new large-load applications in Q1 2026 alone against ~90 GW total peak capacity. These are not near-term risks — they are structural ceilings on where GPU supply can be built and at what cost. Any GPU cloud capacity plan predicated on grid-connected data centers in the US Northeast or Texas faces a timeline risk that makes 18-month supply projections unreliable. The GPU pricing implication: on-demand prices in power-constrained regions (Virginia, Ohio) will face structural upward pressure as supply additions slow, while power-abundant regions (Iowa, Nevada, Missouri) will see increasing new supply — exactly what the GCP L4 Nevada/Iowa data is already showing.
R8 — EU Regulatory Bifurcation: Two GPU Markets Emerging The European Commission is simultaneously pushing AI gigafactory mandates (via the Cloud and AI Development Act) while individual EU member states cancel hyperscale projects in favor of smaller regional hubs with green mandates. The Frankfurt GPU market already shows bifurcation: V520 (media GPU) SPOT up +187% over 30 days while VIRTEX FPGAs correct -31% over 7 days — different compute classes moving in opposite directions within the same region. If EU data sovereignty rules (the EU AI Act's compute governance provisions) increasingly restrict cross-border GPU workloads, EU on-demand pricing will decouple from global benchmarks and establish a persistent regional premium. Watch: EU Cloud and AI Development Act implementation timeline and any German or French data localization mandates in H2 2026.
R9 — Custom ASIC Substitution: Structural H100 Demand Erosion Beyond B200 The structural demand destruction risk for merchant GPU capacity is not B200 (which replaces H100 with a better Nvidia GPU) but custom ASICs (which replace Nvidia entirely). Broadcom and Marvell's 95% share of ASIC co-design, ASICs taking 27.8% of AI server compute share in 2026 growing at 44.6% YoY, and the Inferentia/Trainium rotation showing that even a hyperscaler as GPU-invested as AWS is moving workloads off Nvidia — these are early signals of a multi-year substitution dynamic. The H100 collateral risk at CoreWeave ($35B debt collateralized by H100 clusters) is not just about B200 cannibalization; it is about whether merchant GPU capacity retains pricing power in a world where hyperscalers increasingly train and infer on in-house silicon. Watch: Any announcements of Microsoft's Maia 2 or Meta's MTIA deployment at scale — at the point where hyperscaler in-house silicon displaces GPU cloud purchasing, the merchant GPU market loses its price-anchoring demand.