First 100,000-GPU training cluster
2024-09: xAI brought its Colossus cluster in Memphis online with ~100,000 H100 GPUs (~150 MW), built in months. It was the first single cluster at six-figure GPU scale.
Who is building the biggest AI compute?
The single most intuitive view — current position against the end goal, on a log scale.
Standings by actor, within this field only.
Clear-cut events: crossed or not crossed.
Combined annual capital spending by the four largest US hyperscalers crossed $200B, dominated by AI data centers — the fastest capex ramp in corporate history.
Stargate, a $500B, ~10 GW US data-center program, was announced, with the first multi-GW campus rising in Abilene, Texas. It is the largest compute buildout ever committed.
Colossus scaled to ~200,000 GPUs and gigawatt-class power as GB200 racks came online — the first single site to approach 1 GW of AI compute.
AWS activated Project Rainier — nearly 500,000 of its own Trainium2 chips across multiple US data centers, one of the world's largest AI compute clusters and the biggest built on custom silicon rather than Nvidia GPUs. Anthropic trains and serves Claude on it (>5x its previous training compute), scaling toward 1M+ Trainium2 chips — a proof point that frontier-scale compute can run on a hyperscaler's in-house accelerators.
Multiple operators are building toward a single cluster of ~1,000,000 GPUs and multi-GW campuses — under construction, not yet operational.
The finish line: a single campus delivering on the order of 10 GW of AI compute — the scale Stargate and rivals target around 2029, not yet here.
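To make the log-scale framing concrete, here is a minimal sketch of how a position against the end goal could be computed from the power figures quoted above. It is not the tracker's actual method: the function name `log_progress`, the choice of ~150 MW as the baseline, and the use of site power as the measure are all assumptions made for illustration.

```python
import math


def log_progress(current_gw, start_gw, goal_gw):
    """Fraction of the way from start to goal on a logarithmic scale.

    Returns 0.0 at the baseline and 1.0 at the goal. Hypothetical helper,
    not taken from the tracker itself.
    """
    if min(current_gw, start_gw, goal_gw) <= 0:
        raise ValueError("power values must be positive")
    return math.log(current_gw / start_gw) / math.log(goal_gw / start_gw)


# Power figures quoted in the milestones above, in gigawatts.
START_GW = 0.15  # ~150 MW at ~100,000 GPUs (baseline choice is an assumption)
GOAL_GW = 10.0   # ~10 GW single-campus finish line

milestones = {
    "First 100,000-GPU cluster (~150 MW)": 0.15,
    "First site approaching 1 GW": 1.0,
    "Finish line (~10 GW)": 10.0,
}

for name, gw in milestones.items():
    # Linear share of the gap, shown only for comparison with the log view.
    linear = (gw - START_GW) / (GOAL_GW - START_GW)
    print(f"{name}: log {log_progress(gw, START_GW, GOAL_GW):.0%}, linear {linear:.0%}")
```

Under these assumptions, a ~1 GW site sits a little under halfway to the goal on the log scale but less than a tenth of the way on a linear scale, which is why a log view suits quantities that grow by multiples.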
Every figure links to a primary source. We publish no invented scores. Tracker numbers are neutral; analysis is labelled separately.