Methodology

These public-facing capability pillars are plain-language summaries built on a more detailed coverage map spanning reasoning, learning, truthfulness, self-monitoring, social competence, multimodal understanding, safety, and robustness. Each granular question is meant to be supported by evidence from benchmarks, controlled studies, audits, red-team exercises, longitudinal trials, or expert-blind review.

In progress | High confidence | Self-monitoring

AI knows when it may be wrong and can recover

An AI system can estimate its uncertainty, abstain when needed, detect its own errors, and improve after feedback rather than bluffing when the risk of being wrong is high.
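As a rough illustration of what "estimate uncertainty" and "abstain when needed" could look like in practice, the sketch below shows a confidence-threshold abstention rule and a simple expected calibration error (ECE) calculation. The function names, the 0.75 threshold, and the 10-bin layout are all assumptions chosen for illustration; nothing here is taken from the methodology itself.

```python
from typing import Optional, Sequence


def answer_or_abstain(probs: Sequence[float], threshold: float = 0.75) -> Optional[int]:
    """Return the index of the most likely answer, or None to abstain.

    `threshold` is a hypothetical cutoff, not a recommended value.
    """
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted average gap between stated confidence and observed accuracy."""
    total = len(confidences)
    if total == 0:
        return 0.0

    # Group predictions into equal-width confidence bins over [0, 1].
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece
```

For example, `answer_or_abstain([0.4, 0.35, 0.25])` abstains because no option reaches the cutoff, while a lower ECE over a set of graded answers would suggest that stated confidence tracks actual accuracy more closely.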

Progress 60%

Updated Mar 12, 2026

Evidence items 5

Sub-questions 5