OpenAI's Deep Research has more fact-finding stamina than you, but it's still wrong half the time
Briefly

OpenAI's recent developments in generative AI include Deep Research, a technology that leverages web access to improve performance on question-answering tasks. While it demonstrates greater efficiency and persistence than human researchers, the system still fails on nearly half of its tasks. The latest benchmark, BrowseComp, evaluates the browsing capabilities of AI agents across a variety of topics, highlighting their potential advantages in recall and multitasking. However, inconsistent accuracy remains a significant limitation for these models.
"Machine intelligence, on the other hand, has much more extensive recall and can operate tirelessly without getting distracted," write Wei and team.
The premise is that AI agents -- that is, AI models that can browse 'thousands of web pages' -- could be far more resourceful than humans.
Read at ZDNET