A team of researchers at Meta has created GAIA, a benchmark for testing the abilities of general AI assistants such as OpenAI's GPT-4.
Human respondents answered 92% of the GAIA questions correctly, while GPT-4 scored only 15% and GPT-4 Turbo scored less than 10%.
These results suggest that we are still far from achieving artificial general intelligence (AGI), the point at which AI systems can match or outperform humans across a broad range of tasks.