You'll Laugh at This Simple Task AI Still Can't Do
Briefly

A study from the University of Edinburgh finds that multimodal large language models (MLLMs), such as Google's Gemini and OpenAI's GPT-o1, struggle significantly to read analog clocks and yearly calendars. Despite their strength at processing text and images, the models read the time correctly on clock faces less than 25% of the time. Results on dates were somewhat better, with GPT-o1 answering 80% of calendar-based questions correctly. Both tasks combine spatial awareness, context, and basic math, a mix that remains difficult for AI even though children typically learn these skills around age six or seven.
"Their findings show that AI systems, at best, got clock-hand positions right less than a quarter of the time. Mistakes were more common when clocks had Roman numerals or stylized clock hands."
"Researchers tested various clock designs, including some with Roman numerals, with and without second hands, and different dials, showing AI struggles with spatial awareness and math for time and dates."
"Although most people can tell the time and use calendars from an early age, our findings indicate that AI models still have significant gaps in these basic skills."
"GPT-o1, the first generation of OpenAI's reasoning models, performed slightly better on calendars, scoring 80 percent on date questions but still making errors on basic queries about days and months."
Read at Futurism