OpenAI o3 Crushes AI Benchmarks – Still Not AGI

23 Dec, 2024

OpenAI has introduced o3, a large language model that significantly outperforms previous LLMs on reasoning tasks, achieving record-breaking scores on the ARC-AGI-1 and FrontierMath benchmarks. Although it is not considered AGI, the model marks a major leap in AI versatility and problem-solving.

Maria Deutscher writes for SiliconANGLE

OpenAI today detailed o3, its new flagship large language model for reasoning tasks. The model’s introduction caps off a 12-day product announcement series that started with the launch of a new ChatGPT plan. Compared with earlier LLMs, o3 demonstrated significant improvements across benchmarks, including ARC-AGI-1, which tests AI on tasks it wasn’t specifically trained for.

Currently, o3 is available to select researchers as OpenAI refines its safety mechanisms before broader release.
