The overselling of AI - and how to resist it

"Even the best AI coding models succeeded less than 23% of the time when working on real production code. Most models scored above 85% on popular benchmarks, but averaged just 17% success on production maintainability tasks."

"AI coding ROI varied dramatically by language and task. Success rates ranged from 32% in JavaScript to just 4% in C, and dropped as low as 1.5% on complex architectural tasks."

"Dropping AI into an operation will not deliver results without work behind the scenes, including on maintainability. To count as successful, AI-generated code needed to meet strict criteria."

AI coding models underperform on real production code: even the best succeed less than 23% of the time, despite high benchmark scores. A study evaluated 57 LLMs on maintainability tasks and found a wide gap between benchmark performance (above 85% for most models) and production success (17% on average). Success rates varied by programming language, from 32% in JavaScript to 4% in C, and fell as low as 1.5% on complex architectural tasks. The findings indicate that dropping AI into an operation does not guarantee results; substantial behind-the-scenes work is needed to keep the generated code maintainable and effective in production.

#ai-performance #coding-models #real-world-application #programming-languages #maintainability

Read at ZDNET
