OpenAI's GPT-5 Is Here
Briefly

GPT-5 has been shown to outpace its predecessors on several coding benchmarks, scoring 74.9% on SWE-Bench Verified and 88% on Aider Polyglot. In a demonstration, it built a web app for learning French that met the specific requests it was given. Michelle Pokrass highlighted its effectiveness in executing complex tasks and following detailed instructions. It also scored significantly higher than previous models on health-related benchmarks and is reported to hallucinate less, which improves its reliability.
GPT-5 scores 74.9% on SWE-Bench Verified, 55% on SWE-Lancer, and 88% on Aider Polyglot, showcasing significant improvements in coding benchmarks.
Yann Dubois tasked GPT-5 with creating a web app for learning French, and it produced a sleek site with all of the requested features.
Michelle Pokrass states that GPT-5 excels in agentic tasks, effectively executing long chains and tool calls while providing explanations of its actions.
GPT-5 is reported to be the best model for health-related questions, achieving significantly improved scores on multiple health benchmarks.
Read at WIRED