
"The idea of using AI to help with computer programming has become a contentious issue. On the one hand, coding agents can make horrific mistakes that require a lot of inefficient human oversight to fix, leading many developers to lose trust in the concept altogether. On the other hand, some coders insist that AI coding agents can be powerful tools and that frontier models are quickly getting better at coding in ways that overcome some of the common problems of the past."
"Ars Senior AI Editor Benj Edwards fed this task into four AI coding agents with terminal (command line) apps: OpenAI's Codex based on GPT-5, Anthropic's Claude Code with Opus 4.5, Google's Gemini CLI, and Mistral Vibe. The agents then directly manipulated HTML and scripting files on a local machine, guided by a "supervising" AI model that interpreted the prompt and assigned coding tasks to parallel LLMs that can use software tools to execute the instructions."
AI coding assistance is a contentious issue because coding agents can make horrific mistakes that require inefficient human oversight, eroding developer trust. Some coders view AI coding agents as powerful tools, and frontier models are improving at overcoming earlier problems. Four major models were given a prompt to build a full-featured web Minesweeper with sound effects, a surprise gameplay feature, and mobile touchscreen support. The agents operated via terminal apps and directly edited HTML and scripting files on a local machine. A supervising AI interpreted the prompt and assigned tasks to parallel LLMs. Work was paid privately and companies were unaware. Each clone was judged blind without revealing its origin.
Read at Ars Technica
Unable to calculate read time
Collection
[
|
...
]