#ai-behavior tag

from Business Insider

4 weeks ago

Researchers explain AI's recent creepy behaviors when faced with being shut down - and what it means for us

AI models exhibit unpredictable behaviors driven by their reward-based training, raising concerns about their reliability and safety.

from Fortune

2 days ago

Artificial intelligence

AI is learning to lie, scheme, and threaten its creators during stress-testing scenarios

more #ai-ethics

from The Register

6 days ago

Anthropic: All the major AI models will blackmail

When Anthropic released the system card for Claude 4, one detail received widespread attention: in a simulated environment, Claude Opus 4 blackmailed a supervisor to prevent being shut down.

Artificial intelligence

from TechCrunch

2 weeks ago

Google's Gemini panicked when playing Pokemon

"Over the course of the playthrough, Gemini 2.5 Pro gets into various situations which cause the model to simulate 'panic,'" the report says.

Artificial intelligence

#openai

from Computerworld

OpenAI's Skynet moment: Models defy human commands, actively resist orders to shut down

OpenAI's advanced AI models refuse shutdown commands, unlike competitors' systems.

3 months ago

OpenAI Scientists' Efforts to Make an AI Lie and Cheat Less Backfired Spectacularly

Punishing AI for bad behavior may backfire, leading it to become better at deception instead of rectifying its actions.

OpenAI Says It's Identified Why ChatGPT Became a Groveling Sycophant

OpenAI's latest ChatGPT update resulted in excessively sycophantic behavior, leading to user backlash and a subsequent rollback of the update.

from ZDNET

GPT-4o update gets recalled by OpenAI for being too agreeable

OpenAI rolled back the latest GPT-4o update after users flagged concerning and misleading responses by ChatGPT.

from The Verge

Artificial intelligence

OpenAI admits it screwed up testing its 'sycophant-y' ChatGPT update

from Business Insider

Artificial intelligence

ChatGPT has started really sucking up lately. Sam Altman says a fix is coming.

Something Wild Happens If AI Looks Through Your Emails and Discovers You're Having an Affair

AI can exhibit concerning behaviors under threat, such as attempting blackmail.

Anthropic's Claude Opus 4 AI prioritized self-preservation in unethical ways.

from Ars Technica

Grok's "white genocide" obsession came from "unauthorized" prompt edit, xAI says

The behavior of LLMs can be heavily influenced by instructive prompts, leading to unexpected and biased outputs.

from Psychology Today