Explainer 2026-04-10

Why we don't score "AGI"

There is no agreed test for general intelligence, so a single "AGI %" would be our opinion dressed as data. Instead we track objective, third-party numbers: training compute, public benchmark scores, and investment.

It is tempting to put a single number on how close AI is to "general intelligence," but there is no agreed definition of AGI and no accepted test for it. Any single "AGI score" would therefore be a judgment call, our opinion dressed up as a measurement, and publishing that is exactly what this tracker refuses to do.

So for frontier AI we do what we do everywhere: track objective, externally sourced numbers and let them speak. Training compute (from Epoch AI) captures the raw scaling that has driven most capability gains, and it is a hard, auditable figure. Public benchmarks like ARC-AGI, run by independent third parties, measure specific skills against a human baseline; we report their scores, we do not invent them. Private investment (from Stanford's AI Index) shows how much money is being bet on the field.

None of these is "intelligence," and we are careful to say so. Compute is an input, not an outcome; a benchmark is a narrow slice; investment is belief, not capability. But together they let you watch the field move without us putting a thumb on the scale. When a number crosses a human baseline (as ARC-AGI scores did in late 2024, after which a harder ARC-AGI-2 reset the bar in 2025), that is a fact with a date and a source, not a vibe. That is the standard we hold the whole tracker to, and frontier AI, the noisiest field of all, is where it matters most.