Evergreen explainer · neutral data, no opinion

Abstract-reasoning score (ARC-AGI)

Current verified value

88%

OpenAI o3 · ARC-AGI-1, high-compute (2024-12)

Press release · 2024-12 · Human-verified

What it measures

Score on ARC-AGI-1, a set of puzzles that are easy for humans (~85% on average) but long resisted AI systems. In Dec 2024 OpenAI's o3 scored 76–88% depending on compute budget, with 88% in the high-compute setting, making it the first AI to move beyond memorization on this test. ARC-AGI-2 is the harder successor, where frontier models still score low. (A third-party benchmark, not our score.)


History

By actor

OpenAI · o3, high-compute (2024) · 88%

Goal / baseline

now: 88%

goal: ~85% (human)

baseline: GPT-3 era (~0%)

Average human performance on ARC-AGI-1 — the bar AI crossed in late 2024.
