Artificial intelligence (from TechCrunch, 5 days ago): One of Google's recent Gemini AI models scores worse on safety. Gemini 2.5 Flash scores lower on safety tests than Gemini 2.0 Flash, raising concerns about AI safety compliance.
Artificial intelligence (from ZDNET, 4 months ago): OpenAI's o3 isn't AGI yet but it just did something no other AI has done. OpenAI's o3 model demonstrates significant adaptability, scoring 76% on the ARC-AGI benchmark, indicating a promising advance in AI capabilities.
Artificial intelligence (from TechCrunch, 2 weeks ago): OpenAI's o3 AI model scores lower on a benchmark than the company initially implied. OpenAI's o3 model benchmark results are disputed, raising questions about transparency and testing practices.