Frontier AI
Evergreen explainer · neutral data, no opinion

Abstract-reasoning score (ARC-AGI)

Current verified value
88%
OpenAI o3 · ARC-AGI-1, high-compute (2024-12)
Press release · 2024-12 · Human-verified

What it measures

Score on ARC-AGI-1, a set of puzzles that are easy for humans (~85%) but long resisted AI systems. In Dec 2024 OpenAI's o3 reached 76% (low-compute) to 88% (high-compute), the first AI to move beyond memorization on this test. ARC-AGI-2 is the harder successor, on which frontier models still score low. (This is a third-party benchmark, not our own score.)

History

[Chart: ARC-AGI-1 score over time, from the GPT-3 era (~0%) in 2020 to 2024-12.]

By actor

OpenAI · o3, high-compute (2024) · 88%

Goal / baseline

now: 88%
goal: ~85% (human)
baseline: GPT-3 era (~0%)

Average human performance on ARC-AGI-1 — the bar AI crossed in late 2024.

Related reading