One of Google's recent Gemini AI models scores worse on safety

"A recently published Google AI model, Gemini 2.5 Flash, shows a decline in safety performance compared to its predecessor, Gemini 2.0 Flash."

"According to Google's internal benchmarking, Gemini 2.5 Flash scores worse in automated safety tests, raising concerns about adherence to guidelines."

Google's technical report reveals that its Gemini 2.5 Flash AI model performs worse on safety tests than the earlier Gemini 2.0 Flash. Specifically, the newer model regresses by 4.1% on text-to-text safety and by 9.6% on image-to-text safety. Both metrics measure how well a model adheres to safety guidelines in automated tests. While the newer model is better at following instructions, it sometimes generates content that violates these guidelines. Google cites potential false positives as a contributing factor to the safety regressions.

#ai-safety #google-ai #gemini-model #benchmark-testing #text-generation

Read at TechCrunch


