Government Test Finds That AI Wildly Underperforms Compared to Human Employees
Briefly

The trial found that generative AI's summarization capabilities are significantly inferior to humans', with AI summaries scoring 47% against 81% for human-written summaries.
Using Meta's Llama2-70B model, the AI struggled to meet the expectations set for summarizing documents, casting doubt on its practical value in business settings.
The findings highlight a prevalent concern about generative AI's reliability, raising questions about its utility for most organizations in workplace settings.
Evaluators in the trial reported that they could often tell AI outputs apart from human-written summaries, underscoring the quality gap between the two.
Read at Futurism