Why Does AI Struggle to Tell Time?

Artificial intelligence has revolutionized the digital world and changed how we view technology. Yet despite being able to generate images, write novels, complete homework assignments, and analyze emotions, AI often fails at a seemingly simple task: telling the time accurately.

In a study posted on arXiv, an open-access archive for scientific papers, researchers from the University of Edinburgh tested seven large language models (LLMs) to assess their ability to tell time.

The test consisted of questions about images of various clocks and calendars. The study, which will be officially published next April, showed that these models struggle with tasks that are considered fundamental to our daily lives.

Researchers say reading clocks and understanding calendars requires complex cognitive steps and precise visual discrimination (Pixabay)

The researchers wrote in the study: “The ability to interpret and infer time from visual inputs is crucial for many real-world applications, from scheduling events to autonomous systems. Despite advances in multimodal large language models, most research has focused on object detection, image labeling, and scene understanding, but hasn’t focused on temporal inference, which has left the time factor neglected in these systems.”

The research team tested models from various companies, including OpenAI’s ChatGPT-4o, Google’s Gemini, Anthropic’s Claude, Meta’s Llama, the Chinese model Qwen 2 from Alibaba, and MiniCPM from ModelBest. They presented images of clocks in different colors and shapes, including regular wall clocks, clocks with Roman numerals, and clocks without second hands, as well as calendar images showing days and months for the past 10 years.

Clock Test

In the clock test, researchers asked the large language models: “What time is shown in the attached image?” For the calendar test, they posed simple questions like “What day does New Year’s Day fall on?” and difficult questions such as “What is the 153rd day of the year?”

The researchers stated: “Reading clocks and understanding calendars requires complex cognitive steps, needing precise visual discrimination—to recognize the position of clock hands and calendar layout—as well as precise numerical thinking to calculate the number of days between two dates.”
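To make the numerical side of the calendar questions concrete, here is a minimal Python sketch of the arithmetic involved. It is only an illustration and not the researchers' method: the function names are hypothetical, and the year is an assumption, since the article does not say which year each calendar image showed.

```python
from datetime import date, timedelta

# Fixed English names so the result does not depend on the system locale.
# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def nth_day_of_year(year: int, n: int) -> date:
    """Return the calendar date of the n-th day of `year` (1-indexed)."""
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    if not 1 <= n <= days_in_year:
        raise ValueError(f"{year} has only {days_in_year} days")
    return date(year, 1, 1) + timedelta(days=n - 1)


def weekday_of_new_year(year: int) -> str:
    """Return the weekday name of January 1 in `year`."""
    return WEEKDAYS[date(year, 1, 1).weekday()]


def days_between(start: date, end: date) -> int:
    """Return the number of days from `start` to `end`."""
    return (end - start).days


# Assumed example year: 2025 (not taken from the study).
print(nth_day_of_year(2025, 153))   # 2025-06-02
print(nth_day_of_year(2024, 153))   # 2024-06-01 (leap year shifts the date)
print(weekday_of_new_year(2025))    # Wednesday
print(days_between(date(2025, 1, 1), date(2025, 12, 25)))  # 358
```

The leap-year case shows why the "153rd day" question is harder than it looks: the answer depends on the year, so a model has to read the calendar correctly before it can count.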


Overall, the AI models did not achieve satisfactory results. They read the clock time correctly in fewer than 25% of cases, and they struggled with clocks that had Roman numerals or stylized hands just as much as with clocks lacking second hands. The researchers suggest that the problem may lie in detecting the hands and interpreting their angles on the clock face.
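Once the hands have been located, converting their angles into a time is simple arithmetic, which suggests the hard part is the visual step. The Python sketch below illustrates that conversion under stated assumptions: angles are measured in degrees clockwise from the 12 o'clock position, and the function names are hypothetical. It is not how the tested models or the researchers process images.

```python
def hand_angles(hour: int, minute: int) -> tuple[float, float]:
    """Return (hour_hand_angle, minute_hand_angle) in degrees,
    measured clockwise from 12 o'clock (an assumed convention)."""
    # The minute hand moves 6 degrees per minute (360 / 60).
    minute_angle = minute * 6.0
    # The hour hand moves 30 degrees per hour plus 0.5 degrees per minute.
    hour_angle = (hour % 12) * 30.0 + minute * 0.5
    return hour_angle, minute_angle


def time_from_hand_angles(hour_angle: float, minute_angle: float) -> tuple[int, int]:
    """Recover (hour, minute) on a 12-hour dial from the two hand angles."""
    minute = round(minute_angle / 6.0) % 60
    # Subtract the hour hand's drift within the hour before rounding,
    # which keeps the result stable when the angles are slightly noisy.
    hour = round((hour_angle - 0.5 * minute) / 30.0) % 12
    return (12 if hour == 0 else hour), minute


print(time_from_hand_angles(305.0, 60.0))  # (10, 10)
print(time_from_hand_angles(15.0, 180.0))  # (12, 30)
```

A small error in the measured hour-hand angle is absorbed by the rounding, but confusing the hour hand with the minute hand gives a completely different time, which fits the researchers' point that detecting the hands is a likely source of failure.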

Notably, the Gemini model scored highest in the clock-reading test, while ChatGPT-4o excelled at reading calendars, with 80% accuracy. Most of the other large language models made errors on roughly 20% of the calendar questions.

Rohit Saxena, one of the study’s authors and a PhD student at the School of Informatics at the University of Edinburgh, said in a university statement: “Most people can tell time and use calendars from an early age, but our results show the significant gap in AI’s ability to perform what are considered very basic skills for humans. We should not overlook these problems if we want to integrate AI systems into time-sensitive real-world applications such as scheduling, automation, and assistive technology.”

He added, “Although AI can complete most of your homework, I wouldn’t recommend relying on it to meet any deadlines.”
