OpenAI's newly released GPT-5 failed half of the programming tests conducted, the worst result ever for the company's flagship language model in these evaluations. The new Edit button in the code editor was also a problem: once changes were saved, it was not possible to return to the original session. GPT-5 did produce usable code in some instances, but its overall performance raises concerns, especially compared with previous models, which typically achieved near-perfect results.
GPT-5 has failed half of my programming tests, marking the worst performance ever for OpenAI's flagship LLM on carefully designed tests.
The new Edit button in the code editor proved unhelpful: after saving changes, I could not return to my original session.
Some AIs, like those from Microsoft and Google, improved over time, while GPT-5's performance on my coding tests was unexpectedly poor.
Despite GPT-5's failures, it did generate a block of code that was runnable, indicating partial success in coding tasks.