Frontier AI

The first fully interactive ARC benchmark: hand-built game environments with no instructions — agents must discover the rules. At launch every frontier model scored <1% (best 0.37%) while humans solve them all; $2M+ prize pool, results Dec 2026.

Press release · 2026-03

Frontier training compute passes 1e26 FLOP

2025

frontier labs

Largest models crossed 1e26 FLOP — a 10× jump over GPT-4, with compute still growing ~4–5× per year.

Database · 2025

Open reasoning model rivals the frontier — at a fraction of the cost

2025-01

DeepSeek (R1)

DeepSeek-R1, an openly released RL-trained reasoning model, matched leading closed models on math and coding — triggering a market reckoning over AI capex.

Paper · 2025-01

First frontier model shipped on domestic Chinese silicon

2026-04

DeepSeek / Huawei

DeepSeek's 1.6T-parameter V4 runs on Huawei Ascend (950PR), and a Huawei-led team completed full-parameter post-training on ~1,000 Ascend 910Cs — a compute-sovereignty landmark. Pre-training hardware remains undisclosed, so "trained without Nvidia" is NOT established.

Press release · 2026-04

OpenAI files confidentially for IPO

~ 2026-06

OpenAI

OpenAI confirmed a confidential S-1 draft with the SEC (8 Jun 2026) — last valued at $852B after a $122B round, with $25B+ annualized revenue. No timing set; reports point to a possible Sep–Nov window. Would be the defining AI listing.

Press release · 2026-06

A benchmark built to resist saturation: Humanity's Last Exam

2025-01

CAIS · Scale AI

As models saturated existing tests, a 2,500-question expert exam launched on which frontier models initially scored in the single digits — a fresh yardstick for the distance to general capability.

Paper · 2025-01

Reasoning & agentic coding becomes the frontier (Claude Opus 4)

2025-05

Anthropic (Claude Opus 4)

Anthropic's Claude Opus 4 launched with extended thinking and sustained autonomous coding over long tasks — part of a 2025 shift where reasoning/agentic models, not raw scale alone, drove the frontier.

Press release · 2025-05

Tiered safety deployment of a frontier model (Fable 5 / Mythos 5)

2026-06

Anthropic

Anthropic released Claude Fable 5 — a Mythos-class model exceeding any it had made generally available — gated so ~5% of sensitive (e.g. cyber) sessions get a conservatively-tuned model, while the unrestricted Mythos 5 went only to vetted cyberdefenders via Project Glasswing with the US government. Days later the US Commerce Department export-controlled both models, barring all foreign-national access; unable to enforce that selectively in real time, Anthropic shut Fable 5 and Mythos 5 off worldwide (its other models unaffected) — the first time a deployed frontier AI model was export-controlled like a strategic technology.

Press release · 2026-06