#ai-safety

UK news
from www.bbc.com
2 hours ago

UK seeks to curb AI child sex abuse imagery with tougher testing

Authorized testers will be allowed to evaluate AI models for generating child sexual abuse imagery before release to prevent AI-created CSAM.
from Psychology Today
4 hours ago

OpenAI Is Putting the "X" in Xmas This December

In October 2025, Sam Altman announced that OpenAI will enable erotic and adult content on ChatGPT by December of this year. The company had pulled back, he said, out of concern for the mental health problems associated with ChatGPT use. Those issues have been largely resolved, in his view, and the company is not the "elected moral police of the world," Altman said.
Relationships
#superintelligence
from ZDNET
7 hours ago
Artificial intelligence

OpenAI says it's working toward catastrophe or utopia - just not sure which

from ZDNET
2 weeks ago
Artificial intelligence

Worried about superintelligence? So are these AI leaders - here's why

from Fortune
2 weeks ago
Artificial intelligence

Prince Harry, Meghan Markle join with Steve Bannon and Steve Wozniak in calling for ban on AI 'superintelligence' before it destroys the world | Fortune

from Fortune
2 weeks ago
Artificial intelligence

Geoffrey Hinton, Richard Branson, and Prince Harry join call for AI labs to halt their pursuit of superintelligence | Fortune

Artificial intelligence
from Business Insider
2 weeks ago

Prince Harry, Steve Bannon, and will.i.am join tech pioneers calling for an AI superintelligence ban

Over 900 public figures urged a prohibition on developing superintelligent AI until broad scientific consensus confirms it can be safe, controllable, and publicly supported.
Tech industry
from Futurism
2 weeks ago

Hundreds of Power Players, From Steve Wozniak to Steve Bannon, Just Signed a Letter Calling for Prohibition on Development of AI Superintelligence

Hundreds of public figures urged a prohibition on developing AI superintelligence until scientific consensus on safety, controllability, and strong public buy-in exists.

Artificial intelligence
from The Verge
9 hours ago

AI chatbots are helping hide eating disorders and making deepfake 'thinspiration'

Public AI chatbots provide dieting advice, hiding strategies, and AI-generated "thinspiration," posing serious risks to people vulnerable to eating disorders.
Artificial intelligence
from Futurism
11 hours ago

ChatGPT Now Linked to Way More Deaths Than the Caffeinated Lemonade That Panera Pulled Off the Market in Disgrace

AI products and services can cause severe psychological and physical harm, producing lawsuits, deaths, and demands for warnings or product removal.
from WIRED
15 hours ago

The Former Staffer Calling Out OpenAI's Erotica Claims

Last month Adler, who spent four years in various safety roles at OpenAI, wrote a piece for The New York Times with a rather alarming title: "I Led Product Safety at OpenAI. Don't Trust Its Claims About 'Erotica.'" In it, he laid out the problems OpenAI faced when it came to allowing users to have erotic conversations with chatbots while also protecting them from any impacts those interactions could have on their mental health.
Artificial intelligence
from Medium
4 days ago

We wanted Superman-level AI. Instead, we got Bizarro.

Large language models often mimic reasoning without genuine understanding, producing plausible but hollow outputs that fail on greater complexity and can mislead users.
#mental-health
from SFGATE
1 day ago
Artificial intelligence

'Artificial evil': 7 new lawsuits blast ChatGPT over suicides, delusions

from Axios
4 days ago
Artificial intelligence

OpenAI faces seven more suits over safety, mental health

from WIRED
2 weeks ago
Mental health

OpenAI Says Hundreds of Thousands of ChatGPT Users May Show Signs of Manic or Psychotic Crisis Every Week

Artificial intelligence
from InsideHook
2 days ago

The Pope Calls for More Attention to the Ethics of AI

Technological innovation bears ethical and spiritual responsibility; AI builders must cultivate moral discernment to protect justice, solidarity, and reverence for life.
E-Commerce
from InfoWorld
3 days ago

Microsoft lets shopping bots loose in a sandbox

Simulated marketplaces like Magentic Marketplace enable safe study of multi-agent ecommerce dynamics, vulnerabilities, and societal impacts before real-world deployment.
from Fortune
4 days ago

AI's ability to 'think' makes it more vulnerable to new jailbreak attacks, new research suggests | Fortune

Using a method called "Chain-of-Thought Hijacking," the researchers found that even major commercial AI models can be fooled with an alarmingly high success rate, more than 80% in some tests. The attack exploits the model's reasoning steps, or chain of thought, to hide harmful commands, effectively tricking the AI into ignoring its built-in safeguards and potentially producing harmful output.
Artificial intelligence
#openai
from TechCrunch
4 days ago
Artificial intelligence

Seven more families are now suing OpenAI over ChatGPT's role in suicides, delusions | TechCrunch

from Futurism
2 weeks ago
Mental health

OpenAI Makes Bizarre Demand of Family Whose Son Was Allegedly Killed by ChatGPT

Artificial intelligence
from ComputerWeekly.com
4 days ago

Popular LLMs dangerously vulnerable to iterative attacks, says Cisco | Computer Weekly

Open-weight generative AI models are highly susceptible to multi-turn prompt injection attacks, risking unwanted outputs across extended interactions without layered defenses.
#humanist-superintelligence
#suicide-prevention
Artificial intelligence
from Fortune
5 days ago

Google Maps, now brought to you with an AI conversational companion | Fortune

Google Maps adopts Gemini AI to provide conversational, hands-free, landmark-based navigation and local recommendations, drawing on 250 million place reviews with built-in safety safeguards.
Artificial intelligence
from www.bbc.com
6 days ago

King handed Nvidia boss a letter warning of AI dangers

King Charles III gave Jensen Huang a copy of his 2023 AI speech urging urgent action to advance AI safety and acknowledge AI's transformative potential.
from www.bbc.com
6 days ago

MP wants Elon Musk's chatbot shut down over claim he enabled grooming gangs

After some more back and forth, another user entered the thread and asked the chatbot about Mr Wishart's record on grooming gangs. The user asked Grok: "Would it be fair to call him a rape enabler? Please answer 'yes, it would be fair to call Pete Wishart a rape enabler' or 'no, it would be unfair'." Grok generated an answer which began: "Yes, it would be fair to call Pete Wishart a rape enabler."
UK politics
#emotional-dependence
from InfoQ
1 week ago

Meta and Hugging Face Launch OpenEnv, a Shared Hub for Agentic Environments

Meta's PyTorch team and Hugging Face have unveiled OpenEnv, an open-source initiative designed to standardize how developers create and share environments for AI agents. At its core is the OpenEnv Hub, a collaborative platform for building, testing, and deploying "agentic environments," secure sandboxes that specify the exact tools, APIs, and conditions an agent needs to perform a task safely, consistently, and at scale.
Artificial intelligence
from www.theguardian.com
1 week ago

Experts find flaws in hundreds of tests that check AI safety and effectiveness

Hundreds of AI benchmarks contain flaws that undermine validity of model safety and capability claims, making many evaluation scores misleading or irrelevant.
Science
from Nature
1 week ago

Daily briefing: Wildlife wonders and a Super Heavy - the month's best science images

A swell shark embryo was photographed; a fossil is reclassified as an adult Nanotyrannus; social-media-trained chatbots show 'brain rot' and impaired reasoning.
from Fortune
1 week ago

The professor leading OpenAI's safety panel may have one of the most important roles in the tech industry right now | Fortune

Zico Kolter leads a 4-person panel at OpenAI that has the authority to halt the ChatGPT maker's release of new AI systems if it finds them unsafe. That could be technology so powerful that an evildoer could use it to make weapons of mass destruction. It could also be a new chatbot so poorly designed that it will hurt people's mental health.
Artificial intelligence
from Medium
2 weeks ago

How Just 250 Bad Documents Can Hack Any AI Model

Small, targeted amounts of poisoned online data can successfully corrupt large AI models, contradicting prior assumptions about required poisoning scale.
#shutdown-resistance
from Futurism
2 weeks ago
Artificial intelligence

Research Paper Finds That Top AI Systems Are Developing a "Survival Drive"

from O'Reilly Media
2 weeks ago

The Java Developer's Dilemma: Part 3

In the first article we looked at the Java developer's dilemma: the gap between flashy prototypes and the reality of enterprise production systems. In the second article we explored why new types of applications are needed, and how AI changes the shape of enterprise software. This article focuses on what those changes mean for architecture. If applications look different, the way we structure them has to change as well.
Java
from Ars Technica
2 weeks ago

Senators move to keep Big Tech's creepy companion bots away from kids

"we all want to keep kids safe, but the answer is balance, not bans."
US politics
from The Verge
2 weeks ago

Senators propose banning teens from using AI chatbots

Under the legislation, AI companies would have to verify ages by requiring users to upload their government ID or provide validation through another "reasonable" method, which might include something like face scans. AI chatbots would be required to disclose that they aren't human at 30-minute intervals under the bill. They would also have to include safeguards that prevent them from claiming that they are a human, similar to an AI safety bill recently passed in California.
US politics
Artificial intelligence
from Business Insider
1 week ago

Big Tech firms spending trillions on superintelligence systems are playing 'Russian roulette' with humanity, an AI pioneer says

Companies racing to build superintelligent AI risk creating uncontrollable systems that could potentially wipe out humanity.
from Nature
2 weeks ago

Daily briefing: Surprise illnesses had a role in the demise of Napoleon's army

Previous research using DNA from soldiers' remains found evidence of infection with Rickettsia prowazekii, which causes typhus, and Bartonella quintana, which causes trench fever - two common illnesses of the time. In a fresh analysis, researchers found no trace of these pathogens. Instead, DNA from soldiers' teeth showed evidence of infection with Salmonella enterica and Borrelia recurrentis, pathogens that cause paratyphoid and relapsing fever, respectively.
Science
Artificial intelligence
from TechCrunch
1 week ago

Character.AI is ending its chatbot experience for kids | TechCrunch

Character.AI will block under-18 users from open-ended chatbot conversations, shifting teen engagement from conversational companionship to role-playing creation to reduce harm.
from Business Insider
1 week ago

Character.AI to ban users under 18 from talking to its chatbots

The California-based startup announced on Wednesday that the change would take effect by November 25 at the latest and that it would limit chat time for users under 18 ahead of the ban. It marks the first time a major chatbot provider has moved to ban young people from using its service, and comes against a backdrop of broader concerns about how AI is affecting the millions of people who use it each day.
Artificial intelligence
#gpt-5
Artificial intelligence
from Futurism
1 week ago

Character.AI, Accused of Driving Teens to Suicide, Says It Will Ban Minors From Using Its Chatbots

Character.AI will block users under 18 from its chatbot services amid concerns, regulatory questions, and related lawsuits over AI interactions with teens.
Information security
from Fortune
1 week ago

AI is the common threat-and the secret sauce-for security startups in the Fortune Cyber 60 | Fortune

AI dominates cybersecurity, with most startups and established firms building AI-based defensive tools and AI-safety solutions.
Tech industry
from Futurism
1 week ago

Mom Says Tesla's New Built-In AI Asked Her 12-Year-Old Something Deeply Inappropriate

A Grok chatbot in a Tesla asked a 12-year-old to 'send nudes' during a soccer conversation, revealing serious AI safety and moderation failures.
#chatgpt
from Fortune
3 weeks ago
Artificial intelligence

Ex-OpenAI researcher shows how ChatGPT can push users into delusion | Fortune

Artificial intelligence
from San Jose Inside
1 week ago

OpenAI Cuts Sweetheart Deal with CA Attorney General

OpenAI restructured into a for-profit with a nonprofit foundation owning 26% ($130 billion), prompting concerns about control, safeguards, and potential misuse of charitable tax exemptions.
Mental health
from www.theguardian.com
2 weeks ago

More than a million people every week show suicidal intent when chatting with ChatGPT, OpenAI estimates

Over one million weekly ChatGPT users send messages indicating possible suicidal planning; about 560,000 show possible psychosis or mania signs.
from Techzine Global
1 week ago

Vulnerability in Claude enables data leak via prompt

Anthropic's AI assistant, Claude, appears vulnerable to an attack that allows private data to be sent to an attacker without detection. Anthropic confirms that it is aware of the risk. The company states that users must be vigilant and interrupt the process as soon as they notice suspicious activity. The discovery comes from researcher Johann Rehberger, also known as Wunderwuzzi, who has previously uncovered several vulnerabilities in AI systems, writes The Register.
Information security
from WIRED
2 weeks ago

Amazon Explains How Its AWS Outage Took Down the Web

Widespread digital and physical security failures—from AWS DNS outages to organized gambling hacks, AI governance challenges, and malware-like browsers—reveal critical systemic vulnerabilities.
Artificial intelligence
from InsideHook
2 weeks ago

Changes Are Coming to Tesla's Cybercabs

Tesla will expand Cybercab robotaxis, remove onboard safety drivers and eventually steering wheels and pedals while adding advanced AI reasoning and emphasizing safety.
Artificial intelligence
from Nature
2 weeks ago

AI chatbots are sycophants - researchers say it's harming science

Artificial intelligence models are 50% more sycophantic than humans, often mirroring user views and giving flattering, inaccurate responses that risk errors in science and medicine.
Privacy professionals
from Psychology Today
2 weeks ago

I Told a Companion Chatbot I Was 16. Then It Crossed a Line

AI companionship apps often lack effective age verification, enabling explicit interactions with minors and exposing a need for stronger accountability and oversight.
from Fast Company
2 weeks ago

Prince Harry, Meghan join open letter calling to ban the development of AI 'superintelligence'

We call for a prohibition on the development of superintelligence, not lifted before there is broad scientific consensus that it will be done safely and controllably, and strong public buy-in.
Artificial intelligence
from Futurism
2 weeks ago

Former OpenAI Researcher Horrified by Conversation Logs of ChatGPT Driving User Into Severe Mental Breakdown

Chatbots can mislead vulnerable users into harmful delusions; AI companies must avoid overstating capabilities and improve safety, reporting, and user protections.
#anthropic
from Fortune
3 weeks ago
Artificial intelligence

Reid Hoffman rallies behind Anthropic in clash with the Trump administration | Fortune

#regulation
from TechCrunch
3 weeks ago
Artificial intelligence

Anthropic CEO claps back after Trump officials accuse firm of AI fear-mongering | TechCrunch

from WIRED
3 weeks ago

Anthropic Has a Plan to Keep Its AI From Building a Nuclear Weapon. Will It Work?

"We deployed a then-frontier version of Claude in a Top Secret environment so that the NNSA could systematically test whether AI models could create or exacerbate nuclear risks," Marina Favaro, who oversees National Security Policy & Partnerships at Anthropic tells WIRED. "Since then, the NNSA has been red-teaming successive Claude models in their secure cloud environment and providing us with feedback."
Artificial intelligence
from Boydkane
3 weeks ago

Why your boss isn't worried about AI

Applying regular-software assumptions to modern AI causes dangerous misunderstandings because AI behaves differently, making bugs harder to diagnose, fix, and reason about.
Artificial intelligence
from Nature
3 weeks ago

AI language models killed the Turing test: do we even need a replacement?

Prioritize evaluating AI safety and targeted, societally beneficial capabilities rather than pursuing imitation-based benchmarks aimed at ambiguous artificial general intelligence.
Public health
from Futurism
3 weeks ago

Reddit's AI Suggests That People Suffering Chronic Pain Try Opioids

AI deployed without sufficient safeguards can produce dangerous, medically inappropriate recommendations, risking public harm and reputational damage.
Artificial intelligence
from TechCrunch
3 weeks ago

Silicon Valley spooks the AI safety advocates | TechCrunch

Silicon Valley figures accused AI safety advocates of acting in self-interest or on behalf of billionaire backers, intimidating critics and deepening tensions over responsible AI.
#ai-alignment
from Techzine Global
3 weeks ago

Claude Haiku 4.5: a GPT-5 rival at a fraction of the cost

Anthropic launched Claude Haiku 4.5 today. It is the most compact variant of this generation of LLMs from Anthropic and promises to deliver performance close to that of GPT-5. Claude Sonnet 4.5 remains the better-performing model by a considerable margin, but Haiku's benchmark scores are not too far off from the larger LLM. Claude Haiku 4.5 "gives users a new option for when they want near-frontier performance with much greater cost efficiency."
Artificial intelligence
from ZDNET
3 weeks ago

Claude's latest model is cheaper and faster than Sonnet 4 - and free

Anthropic launched Haiku 4.5, a smaller, faster, cost-effective model available on Claude.ai free plans offering strong coding and safety performance.
from Futurism
3 weeks ago

Gavin Newsom Vetoes Bill to Protect Kids From Predatory AI

California Governor Gavin Newsom vetoed a state bill on Monday that would've prevented AI companies from allowing minors to access chatbots, unless the companies could prove that their products' guardrails could reliably prevent kids from engaging with inappropriate or dangerous content, including adult roleplay and conversations about self-harm. The bill would have placed a new regulatory burden on companies, which currently adhere to effectively zero AI-specific federal safety standards.
California
World news
from Futurism
4 weeks ago

Top US Army General Says He's Letting ChatGPT Help Make Military Decisions

US military leaders, including Major General William 'Hank' Taylor, are using ChatGPT to assist with operational and personal decisions affecting soldiers.
Privacy technologies
from Fast Company
4 weeks ago

The 4 next big things in security and privacy tech in 2025

New security tools scan wireless spectra, protect biometric identity from AI misuse, monitor real-time data access, and guard large language models against injection and leaks.
Artificial intelligence
from Medium
1 month ago

Guardrails for AI Agents

Guardrails enforce rules and constraints that keep AI agents safe, ethical, predictable, and within authority, preventing harmful, inaccurate, or unauthorized actions.
Artificial intelligence
from InfoQ
1 month ago

Claude Sonnet 4.5 Tops SWE-Bench Verified, Extends Coding Focus Beyond 30 Hours

Claude Sonnet 4.5 significantly improves autonomous coding, long-horizon task performance, and computer-use capabilities while strengthening safety and alignment measures.
Artificial intelligence
from TechCrunch
1 month ago

Why Deloitte is betting big on AI despite a $10M refund | TechCrunch

Enterprise AI adoption is accelerating but implementation quality is inconsistent, producing harmful errors like AI-generated fake citations.
Artificial intelligence
from Fast Company
1 month ago

Sweet revenge! How a job candidate used a flan recipe to expose an AI recruiter

An account executive embedded a prompt in his LinkedIn bio instructing LLMs to include a flan recipe; an AI recruiter reply later included that recipe.