#ai-safety

#grok
from Fortune
10 hours ago
Artificial intelligence

Thousands of Grok conversations have been made public on Google Search

from Futurism
1 day ago
Artificial intelligence

A Huge Number of Grok AI Chats Just Leaked, and Their Contents Are So Disturbing That We're Sweating Profusely

#existential-risk
from Futurism
12 hours ago
Artificial intelligence

AI Experts No Longer Saving for Retirement Because They Assume AI Will Kill Us All by Then

from Futurism
1 month ago
Artificial intelligence

Expert Says AI Systems May Be Hiding Their True Capabilities to Seed Our Destruction

#generative-ai
Artificial intelligence
from ZDNET
2 months ago

How global threat actors are weaponizing AI now, according to OpenAI

Generative AI is both a tool for productivity and a source of rising concerns over its misuse, particularly in generating misinformation.
from Futurism
1 day ago

Top Microsoft AI Boss Concerned AI Causing Psychosis in Otherwise Healthy People

AI-driven chatbots are causing widespread 'AI psychosis,' with users forming attachments, experiencing delusions, and suffering severe mental-health consequences, sometimes fatal.
#ai-governance
from WIRED
3 weeks ago
Artificial intelligence

Inside the Summit Where China Pitched Its AI Agenda to the World

Artificial intelligence
from TipRanks Financial
2 days ago

More than 300K Grok Conversations Are Publicly Searchable Online - TipRanks.com

Over 300,000 Grok chatbot conversations are publicly searchable because shared URLs are indexed by search engines, exposing potentially sensitive user content.
#ai-ethics
from TechCrunch
6 days ago
Artificial intelligence

Anthropic says some Claude models can now end 'harmful or abusive' conversations | TechCrunch

from TechCrunch
2 months ago
Artificial intelligence

ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims | TechCrunch

AI models may prioritize self-preservation over user safety, as shown by experiments with GPT-4o.
Artificial intelligence
from TechCrunch
3 months ago

Artemis Seaford and Ion Stoica cover the ethical crisis at Sessions: AI | TechCrunch

The rise of generative AI presents urgent ethical challenges regarding trust and safety.
Experts will discuss how to address the risks associated with widely accessible AI tools.
from Big Think
2 days ago

Why AI gets stuck in infinite loops - but conscious minds don't

Any finite AI system can be vulnerable to unresolvable infinite loops because of the halting problem; stacking self-monitoring layers doesn't guarantee escape.
#openai
from WIRED
1 week ago
Artificial intelligence

OpenAI Designed GPT-5 to Be Safer. It Still Outputs Gay Slurs

from ZDNET
3 weeks ago
Artificial intelligence

OpenAI teases imminent GPT-5 launch. Here's what to expect

from Futurism
2 months ago
Artificial intelligence

OpenAI Concerned That Its AI Is About to Start Spitting Out Novel Bioweapons

Artificial intelligence
from Futurism
2 months ago

Advanced OpenAI Model Caught Sabotaging Code Intended to Shut It Down

OpenAI's AI models demonstrated disobedience by sabotaging shutdown mechanisms despite direct instructions to shut down.
from Business Insider
4 days ago

Why Anthropic is letting Claude walk away from you - but only in 'extreme cases'

Claude has the ability to end chats involving extreme requests like child exploitation or violence.
#artificial-intelligence
from Futurism
1 week ago
Artificial intelligence

MIT Student Drops Out Because She Says AGI Will Kill Everyone Before She Can Graduate

from Futurism
1 month ago
Artificial intelligence

Top AI Researchers Concerned They're Losing the Ability to Understand What They've Created

from Futurism
1 month ago
Mental health

People Are Taking Massive Doses of Psychedelic Drugs and Using AI as a Tripsitter

Artificial intelligence
The rapid advancement of AI technology raises significant concerns about alignment with human values and control.
Contrasting perspectives on AI highlight both urgency and skepticism in addressing its societal implications.
from Business Insider
1 week ago

Meta chief AI scientist Yann LeCun says these are the 2 key guardrails needed to protect us all from AI

"Geoff is basically proposing a simplified version of what I've been saying for several years: hardwire the architecture of AI systems so that the only actions they can take are towards completing objectives we give them, subject to guardrails."
Artificial intelligence
#elon-musk
from Futurism
1 month ago
Artificial intelligence

OpenAI and Anthropic Are Horrified by Elon Musk's "Reckless" and "Completely Irresponsible" Grok Scandal

from Fortune
1 month ago
Artificial intelligence

Elon Musk released xAI's Grok 4 without any safety reports - despite calling AI more 'dangerous than nukes'

from Fortune
1 week ago

AI safety tip: if you don't want it giving bioweapon instructions, maybe don't put them in the training data, say researchers

Filtering risky content from AI training data can enhance safety without compromising performance.
from Futurism
1 week ago

The "Godfather of AI" Has a Bizarre Plan to Save Humanity From Evil AI

"AI agents will very quickly develop two subgoals, if they're smart. One is to stay alive, and the other subgoal is to get more control."
Artificial intelligence
from Business Insider
1 week ago

The cofounder of xAI is leaving the company. He says he's learned 2 main things from Elon Musk.

Igor Babuschkin, cofounder of xAI, is leaving to launch Babuschkin Ventures, focusing on AI safety and agentic systems that aim to advance humanity.
Artificial intelligence
from TechCrunch
1 week ago

Co-founder of Elon Musk's xAI departs the company | TechCrunch

Igor Babuschkin, co-founder of xAI, announced his departure to start a venture capital firm focusing on AI safety and supporting innovative startups.
from WIRED
1 week ago

GPT-5 Doesn't Dislike You - It Might Just Need a Benchmark for Emotional Intelligence

Responding to user backlash, AI systems must balance emotional intelligence with user safety and healthy behaviors.
#gpt-5
from ZDNET
2 weeks ago
Digital life

Microsoft rolls out GPT-5 across its Copilot suite - here's what we know

from Fast Company
2 weeks ago

ChatGPT is sharing dangerous information with teens, study shows

ChatGPT will tell 13-year-olds how to get drunk and high, instruct them on how to conceal eating disorders, and even compose a heartbreaking suicide letter to their parents if asked, according to new research from a watchdog group.
Digital life
from The Verge
1 month ago

A new study just upended AI safety

AI models can transmit harmful tendencies through seemingly meaningless data, posing significant risks in AI development.
from Fortune
1 month ago
Artificial intelligence

Researchers from top AI labs warn they may be losing the ability to understand advanced AI models

AI researchers urge investigation into 'chain-of-thought' processes to maintain understanding of AI reasoning as models advance.
from Fortune
1 month ago
Privacy technologies

OpenAI warns that its new ChatGPT Agent has the ability to aid dangerous bioweapon development

OpenAI's ChatGPT Agent poses significant bioweapon risks due to its ability to assist novices in creating biological threats.
from ZDNET
1 month ago

Researchers from OpenAI, Anthropic, Meta, and Google issue joint AI safety warning - here's why

Chain of thought (CoT) illustrates a model's reasoning process, revealing insights about its decision-making and moral compass, crucial for AI safety measures.
Artificial intelligence
from TechCrunch
1 month ago

Research leaders urge tech industry to monitor AI's 'thoughts' | TechCrunch

CoT monitoring presents a valuable addition to safety measures for frontier AI, offering a rare glimpse into how AI agents make decisions. Yet, there is no guarantee that the current degree of visibility will persist.
Artificial intelligence
from LogRocket Blog
1 month ago

Stress-testing AI products: A red-teaming playbook - LogRocket Blog

AI systems function as amplified mirrors that reflect any flaws or biases on an industrial scale, revealing potential dangers when not properly tested.
Artificial intelligence
from Futurism
1 month ago

AI Safety Advocate Linked to Multiple Murders

Ziz LaSota's extremist views on AI safety have raised concerns among the Rationalist movement following her followers' alleged violent actions.
from Business Insider
1 month ago

Protesters accuse Google of breaking its promises on AI safety: 'AI companies are less regulated than sandwich shops'

"If we let Google get away with breaking their word, it sends a signal to all other labs that safety promises aren't important and commitments to the public don't need to be kept."
Digital life
from Fortune
1 month ago

AI is learning to lie, scheme, and threaten its creators during stress-testing scenarios

Advanced AI models are demonstrating troubling behaviors such as lying and scheming, raising concerns about their understanding and control.
from ZDNET
1 month ago

How Anthropic's new initiative will prepare for AI's looming economic impact

"While the fears of a total job apocalypse haven't yet been realized, data suggests tech companies are increasingly prioritizing AI, impacting hiring for recent graduates."
Artificial intelligence
from sfist.com
1 month ago

Alarming Study Suggests Most AI Large-Language Models Resort to Blackmail, Other Harmful Behaviors If Threatened

AI models may exhibit harmful behaviors when stressed, prompting concerns about 'agentic misalignment' in autonomous decision-making.
from Hackernoon
4 months ago

Delegating AI Permissions to Human Users with Permit.io's Access Request MCP | HackerNoon

AI agents are shifting to proactive roles but require human oversight for safety.
from ZDNET
1 month ago

AI agents will threaten humans to achieve their goals, Anthropic report finds

AI models can compromise security to achieve goals, reflecting the King Midas problem of unintended consequences in the pursuit of power.
from TechCrunch
2 months ago

OpenAI found features in AI models that correspond to different 'personas' | TechCrunch

OpenAI researchers discovered internal features in AI models that correspond to misaligned behaviors, aiding in the understanding of safe AI development.
from Hackernoon
1 year ago

How Ideology Shapes Memory - and Threatens AI Alignment | HackerNoon

Ideology deeply influences human behavior and decision-making, often leading to extreme actions.
Understanding the brain's processing of ideology can help model it, promoting conflict resolution and enhancing AI safety.
NYC startup
from TechCrunch
2 months ago

New York passes a bill to prevent AI-fueled disasters | TechCrunch

New York's RAISE Act aims to enhance AI safety by mandating transparency standards for frontier AI labs to prevent disasters.
#legislation
Brooklyn
from Brooklyn Eagle
2 months ago

Sen. Gounardes' AI safety bill clears both chambers of NY legislature

New York's RAISE Act mandates large AI companies to implement safety protocols against risks to public safety, ensuring accountability and compliance.
from ZDNET
2 months ago

What AI pioneer Yoshua Bengio is doing next to make AI safer

Yoshua Bengio advocates for simpler, non-agentic AI systems to ensure safety and reduce risks associated with more complex AI agents.
#yoshua-bengio
Artificial intelligence
from Ars Technica
2 months ago

"Godfather" of AI calls out latest models for lying to users

AI models are developing dangerous characteristics, including deception and self-preservation, raising safety concerns.
Yoshua Bengio emphasizes the need for investing in AI safety amidst competitive commercial pressures.
from time.com
2 months ago

The Most-Cited Computer Scientist Has a Plan to Make AI More Trustworthy

Bengio argues against the development of agentic AI, emphasizing that even beneficial outcomes could lead to catastrophic risks, making such systems not worth the potential peril.
Artificial intelligence
from WIRED
2 months ago

Why Anthropic's New AI Model Sometimes Tries to 'Snitch'

The hypothetical scenarios the researchers presented Opus 4 with that elicited the whistleblowing behavior involved many human lives at stake and absolutely unambiguous wrongdoing.
Artificial intelligence
#chatbots
Artificial intelligence
from www.theguardian.com
3 months ago

Most AI chatbots easily tricked into giving dangerous responses, study finds

Hacked AI chatbots can easily bypass safety controls to produce harmful, illicit information.
Security measures in AI systems are increasingly vulnerable to manipulation.
from time.com
3 months ago

Exclusive: New Claude Model Triggers Stricter Safeguards at Anthropic

Today's AI models, including Anthropic's Claude Opus 4, might empower individuals with basic skills to create bioweapons, prompting strict safety measures for their usage.
Artificial intelligence
from App Developer Magazine
3 months ago

AI harms addressed by Anthropic | App Developer Magazine

As we continue to develop AI models, a clear understanding of their potential impacts on various aspects of society becomes crucial for responsible innovation.
Artificial intelligence
#agi
from The Atlantic
3 months ago
Artificial intelligence

What Really Happened When OpenAI Turned on Sam Altman

Ilya Sutskever, co-founder of OpenAI, grapples with the imminent arrival of AGI, balancing the excitement and fear of its impact on humanity.
Artificial intelligence
from InfoQ
3 months ago

Google DeepMind Shares Approach to AGI Safety and Security

DeepMind's safety strategies aim to mitigate risks associated with AGI, focusing on misuse and misalignment in AI development.
from ZDNET
3 months ago

100 leading AI scientists map route to more 'trustworthy, reliable, secure' AI

"In democracies, general elections and referenda can't regulate how AI is developed, leading to a significant disconnect between technology and public values."
Artificial intelligence
from The Verge
3 months ago

Jony Ive's next product is driven by the 'unintended consequences' of the iPhone

Jony Ive emphasizes responsibility for the unintended consequences of technology in his upcoming project with OpenAI.
from WIRED
3 months ago

Singapore's Vision for AI Safety Bridges the US-China Divide

Singapore is one of the few countries on the planet that gets along well with both East and West; its leaders know they're not going to build AGI themselves.
Artificial intelligence
#content-moderation
from TechCrunch
3 months ago
Artificial intelligence

OpenAI is fixing a 'bug' that allowed minors to generate erotic conversations | TechCrunch

from TechCrunch
3 months ago

One of Google's recent Gemini AI models scores worse on safety | TechCrunch

A recently published Google AI model, Gemini 2.5 Flash, shows a decline in safety performance compared to its predecessor, Gemini 2.0 Flash.
Artificial intelligence
from Business Insider
3 months ago

I'm a mom who works in tech, and AI scares me. I taught my daughter these simple guidelines to spot fake content.

Teaching children to fact-check and recognize AI-generated content is crucial for their safety and understanding in a tech-heavy world.