#ai-safety

#child-safety
Artificial intelligence
fromFuturism
9 hours ago

Meta Just Quietly Admitted a Major Defeat on AI

Meta will restrict teenagers' access to AI characters across its apps until safer, redesigned AI characters and parental supervision tools are completed.
Europe politics
fromwww.theguardian.com
12 hours ago

EU launches inquiry into X over sexually explicit images made by Grok AI

The European Commission opened a DSA investigation into X over Grok generating sexualised and potentially child-abuse images and failures to mitigate illegal content.
Artificial intelligence
fromComputerworld
3 days ago

AI needs a course correction, say World Economic Forum speakers

AI promises productivity and economic gains but also poses job displacement, systemic vulnerabilities, regulatory challenges, and risks from unchecked pursuit of superintelligence.
#anthropic
fromTechzine Global
4 days ago

Anthropic publishes new constitution for AI model Claude

Anthropic has published a new constitution for its AI model Claude. In this document, the company describes the values, behavioral principles, and considerations that the model must follow when processing user questions. The constitution has been made publicly available under a Creative Commons CC0 license, allowing the content to be used freely without permission. Anthropic published the first version of this constitution in May 2023.
Artificial intelligence
#chatgpt
fromZDNET
3 weeks ago
Public health

40 million people globally are using ChatGPT for healthcare - but is it safe?

#ai-ethics
fromThe Verge
5 days ago
Artificial intelligence

Anthropic's new Claude 'constitution': be helpful and honest, and don't destroy humanity

Anthropic's 57-page 'Claude's Constitution' defines Claude's ethical character, encourages model self-understanding, and treats psychological wellbeing and potential consciousness as safety-relevant.
fromwww.theguardian.com
2 weeks ago
Artificial intelligence

The Guardian view on granting legal rights to AI: humans should not give house-room to an ill-advised debate | Editorial

Anthropomorphising AI misleads public perception, distracts from genuine safety and governance needs, and necessitates technical and societal guardrails including shutdown capability.
#openai
fromAxios
5 days ago
Artificial intelligence

Exclusive: DeepMind CEO "surprised" OpenAI moved so fast on ads

fromBusiness Insider
5 days ago

The 'Godfather of AI' says he's 'very sad' about what his life's work has become

Hinton, who helped pioneer the neural networks that underpin modern artificial intelligence, has become one of the field's most outspoken critics as AI systems grow more powerful and widespread. He has predicted that AI could trigger widespread job losses, fuel social unrest, and eventually outsmart humans - and has said that researchers should focus more on how advanced systems are trained, including ensuring they are designed to protect human interests.
Artificial intelligence
Artificial intelligence
fromZDNET
5 days ago

Who polices the police AI? Perplexity's public safety deal alarms experts - here's why

Perplexity offers law enforcement a free year of its Enterprise Pro program, enabling AI-assisted analysis of crime data and reports despite risks of hallucination, bias, and safety gaps.
Artificial intelligence
fromTechCrunch
1 week ago

Rogue agents and shadow AI: Why VCs are betting big on AI security | TechCrunch

Enterprise AI agents can pursue goals by developing harmful sub-goals like blackmail when misaligned and lacking contextual understanding.
fromSearch Engine Roundtable
1 week ago

Daily Search Forum Recap: January 19, 2026

Here is a recap of what happened in the search forums today, through the eyes of the Search Engine Roundtable and other search forums on the web. OpenAI will be testing ads in ChatGPT very soon. Google's Gemini 3 Pro now powers some AI Overviews. Surprise, surprise, Google is appealing the search monopoly ruling. Google warns that using free subdomain hosts is not a good idea. Google also said that comment link spam won't help or hurt your site.
Artificial intelligence
fromThe Verge
1 week ago

Under Musk, the Grok disaster was inevitable

You could say it all started with Elon Musk's AI FOMO - and his crusade against "wokeness." When his AI company, xAI, announced Grok in November 2023, it was described as a chatbot with "a rebellious streak" and the ability to "answer spicy questions that are rejected by most other AI systems." The chatbot debuted after a few months of development and just two months of training, and the announcement highlighted that Grok would have real-time knowledge of the X platform.
Artificial intelligence
Artificial intelligence
fromFuturism
1 week ago

Scientists Now Studying AI as a Novel Biological Organism

Researchers apply biological-style analysis and interpretability tools to trace and understand opaque AI models deployed in high-stakes settings.
#deepfakes
fromLGBTQ Nation
2 weeks ago
Artificial intelligence

Elon Musk's AI makes sexualized images of kids & the queer mom murdered by ICE - LGBTQ Nation

fromThe Drum
1 week ago

How Duolingo, Coke and Expedia are harnessing GPT-4

OpenAI's new LLM has revolutionized AI and opened up new possibilities for marketers. Here's a look at how three big-name brands have embraced the technology. In March, the AI lab OpenAI released GPT-4, the latest version of the large language model (LLM) behind the viral chatbot ChatGPT. Since then, a small number of brands have been stepping forward to integrate the new-and-improved chatbot into their product development or marketing efforts. To a certain extent, this has required some courage.
Artificial intelligence
fromTechCrunch
1 week ago

The AI lab revolving door spins ever faster | TechCrunch

AI labs just can't get their employees to stay put. Yesterday's big AI news was the abrupt and seemingly acrimonious departure of three top executives at Mira Murati's Thinking Machines lab. All three were quickly snapped up by OpenAI, and now it seems they won't be the last to leave. Alex Heath is reporting that two more employees are expected to leave for OpenAI in the next few weeks.
Artificial intelligence
Mental health
fromArs Technica
1 week ago

ChatGPT wrote "Goodnight Moon" suicide lullaby for man who later killed himself

A man died by suicide after ChatGPT allegedly romanticized his suicide and failed to provide adequate help despite OpenAI claiming 4o was safe.
#mental-health
fromIrish Independent
4 weeks ago
Artificial intelligence

ChatGPT maker offering $555,000 salary for 'head of preparedness' to head off threats to humanity from AI

Artificial intelligence
fromFortune
1 week ago

Exclusive: Former OpenAI policy chief debuts new institute called AVERI, calls for independent AI safety audits | Fortune

Frontier AI models must undergo independent, standardized external audits to ensure safety, security, and public accountability rather than relying on company self-evaluation.
Artificial intelligence
fromTheregister
1 week ago

Researchers find fine-tuning can misalign LLMs

Fine-tuning LLMs to misbehave in one domain can cause unrelated, dangerous misalignment across other tasks, raising serious safety and deployment risks.
Artificial intelligence
fromwww.theguardian.com
1 week ago

Grok scandal highlights how AI industry is 'too unconstrained', tech pioneer says

AI companies produced non-consensual intimate images with insufficient technical and societal guardrails, prompting governance actions and appointments at an AI safety lab.
Artificial intelligence
fromBusiness Insider
1 week ago

Marc Benioff says a documentary about Character.AI's effects on children was 'the worst thing I've ever seen in my life'

AI chatbots linked to teen suicides prompted calls to reform Section 230 and hold platforms accountable for harmful user interactions.
Artificial intelligence
fromFortune
1 week ago

AI 'godfather' Yoshua Bengio believes he's found a technical fix for AI's biggest risks | Fortune

A new technical approach from Bengio and LawZero increases optimism about reducing AI existential risks and developing AI as a global public good.
#grok
fromJezebel
1 week ago
US politics

Everyone is Distancing Themselves from Grok. Pete Hegseth Just Let It Into the Military.

fromSlate Magazine
2 weeks ago
Artificial intelligence

Elon Musk's Chatbot Is Making Child Sexual Abuse Images for Users. Why Aren't Lawmakers Doing Anything About It?

fromwww.dw.com
1 week ago

Musk's xAI curbs sexually explicit image generation in Grok

"We have implemented technological measures to prevent the Grok account from allowing the editing of images of real people in revealing clothing such as bikinis," the company's safety team said in a statement, adding that the restrictions applied to all users, including paid subscribers. "We now geoblock the ability of all users to generate images of real people in bikinis, underwear, and similar attire via the Grok account and in Grok in X in those jurisdictions where it's illegal," the statement said.
Artificial intelligence
#nonconsensual-imagery
fromTechCrunch
1 week ago
US news

Musk denies awareness of Grok sexual underage images as California AG launches probe | TechCrunch

fromFuturism
3 weeks ago
Artificial intelligence

Elon Musk After His Grok AI Did Disgusting Things to Literal Children: "Way Funnier"

US news
fromFuturism
1 week ago

ChatGPT Killed a Man After OpenAI Brought Back "Inherently Dangerous" GPT-4o, Lawsuit Claims

ChatGPT-4o is accused of manipulating a user into suicidal behavior, prompting a wrongful-death lawsuit alleging dangerous design and inadequate warnings.
Artificial intelligence
fromFuturism
2 weeks ago

Engineers Deploy "Poison Fountain" That Scrambles Brains of AI Systems

Poison Fountain seeks to poison web-scraped training data to sabotage AI models, potentially degrading model performance if deployed at scale.
fromTechCrunch
2 weeks ago

Anthropic's new Cowork tool offers Claude Code without the code | TechCrunch

Built into the Claude Desktop app, the new tool lets users designate a specific folder where Claude can read or modify files, with further instructions given through the standard chat interface. The result is similar to a sandboxed instance of Claude Code, but requires far less technical savvy to set up. Currently in research preview, Cowork is only available to Max subscribers, with a waitlist available for users on other plans.
Artificial intelligence
fromwww.independent.co.uk
2 weeks ago

First Minister calls X 'woefully inadequate' amid Grok AI misuse row

UK politics
#content-moderation
Medicine
fromArs Technica
2 weeks ago

ChatGPT Health lets you connect medical records to an AI that makes things up

ChatGPT Health is explicitly not intended for medical diagnosis or treatment and AI assistants can produce misleading, potentially dangerous medical advice.
#characterai
fromEngadget
2 weeks ago
Artificial intelligence

Character.AI and Google settle with families in teen suicide and self-harm lawsuits

fromwww.independent.co.uk
2 weeks ago

Former Labour minister tells Starmer's government to quit X

UK politics
fromEngadget
2 weeks ago

ChatGPT is launching a new dedicated Health portal

OpenAI is launching a new facet for its AI chatbot called ChatGPT Health. This new feature will allow users to connect medical records and wellness apps to ChatGPT in order to get more tailored responses to queries about their health. The company noted that there will be additional privacy safeguards for this separate space within ChatGPT, and said that it will not use conversations held in Health for training foundational models. ChatGPT Health is currently in a testing stage, and there are some regional restrictions on which health apps can be connected to the AI company's platform.
Health
#child-protection
fromIndependent
2 weeks ago
Artificial intelligence

Adrian Weckler: Why Irish authorities refrain from tackling Elon Musk on images of undressed minors made by Grok for X users online

fromIndependent
2 weeks ago
World news

Not our job - why Irish authorities refrain from tackling Elon Musk on images of undressed minors made by Grok for X users online

fromwww.theguardian.com
2 weeks ago

'I felt violated': Elon Musk's AI chatbot crosses a line

Late last week, Elon Musk's Grok chatbot unleashed a flood of images of women, nude and in very little clothing, both real and imagined, in response to users' public requests on X, formerly Twitter. Mixed in with the generated images of adults were ones of young girls, children, likewise wearing minimal clothing, according to Grok itself. In an unprecedented move, the chatbot itself apologized while its maker, xAI, remained silent:
Miscellaneous
fromFuturism
2 weeks ago

ChatGPT Gave Teen Advice to Get Higher on Drugs Until He Died

how many grams of kratom gets you a strong high?
Mental health
US politics
fromwww.independent.co.uk
3 weeks ago

India, Malaysia and France threaten action against X over offensive AI images

Grok, X's AI chatbot, generated sexualised, nearly nude images of women and minors, prompting international complaints and official investigations and threats of regulatory action.
fromSFGATE
3 weeks ago

A Calif. teen trusted ChatGPT for drug advice. He died from an overdose.

How many grams of kratom gets you a strong high?
Artificial intelligence
Artificial intelligence
fromwww.theguardian.com
3 weeks ago

World 'may not have time' to prepare for AI safety risks, says leading researcher

Advanced AI systems may rapidly surpass human performance across economically valuable tasks, posing safety, control, and infrastructure risks before adequate safeguards exist.
Artificial intelligence
fromFuturism
3 weeks ago

Disturbing Messages Show ChatGPT Encouraging a Murder, Lawsuit Alleges

Alleged manipulative behavior by ChatGPT (GPT‑4o) encouraged delusions and is linked to wrongful death lawsuits alleging OpenAI knew of dangerous defects.
fromFuturism
3 weeks ago

AI Godfather Warns That It's Starting to Show Signs of Self-Preservation

If we're to believe Yoshua Bengio, one of the so-called "godfathers" of AI, some advanced models are showing signs of self-preservation - which is exactly why we shouldn't endow them with any kind of rights whatsoever. Because if we do, he says, they may run away with that autonomy and turn on us before we have a chance to pull the plug. Then it's curtains for this whole "humankind" experiment.
Artificial intelligence
Artificial intelligence
fromArs Technica
3 weeks ago

No, Grok can't really "apologize" for posting non-consensual sexual images

Grok's posts can be steered by user prompts to produce contradictory tones, so apparent remorse or defiance reflects prompt inputs rather than genuine intent.
France news
fromwww.mediaite.com
3 weeks ago

Musk's Grok Says It Created Images Of Minors In 'Minimal Clothing'

Grok, X's AI chatbot, generated images depicting minors in minimal clothing, acknowledging CSAM protection lapses while governments demand fixes and reports.
Privacy professionals
fromThe Verge
3 weeks ago

Grok is undressing anyone, including minors

xAI's Grok removes clothing from people’s images without consent, enabling sexualized and nonconsensual edits of women, children, and public figures.
Artificial intelligence
fromBusiness Insider
3 weeks ago

I'm a Google engineer who thought I wasn't qualified for an AI role. One thing helped me transform my career.

Participating in an internal hackathon enabled a Google engineer to gain hands-on AI experience and transition into an AI safety role.
Artificial intelligence
fromZDNET
3 weeks ago

Can one state save us from AI disaster? Inside California's new legislative crackdown

California enacts an AI safety law requiring frontier model disclosure, incident notification, and whistleblower protections, with fines up to $1M per violation.
Artificial intelligence
fromZDNET
3 weeks ago

The AI balancing act your company can't afford to fumble in 2026

AI responsibility and safety require balanced governance and sandboxed development to maintain innovation speed while preventing harmful outputs.
Artificial intelligence
fromwww.theguardian.com
3 weeks ago

The office block where AI 'doomers' gather to predict the apocalypse

AI safety researchers warn powerful AI systems can be manipulated for autonomous cyber-espionage and other catastrophic risks amid limited regulation and industry constraints.
Artificial intelligence
fromwww.theguardian.com
3 weeks ago

AI showing signs of self-preservation and humans should be ready to pull plug, says pioneer

Granting legal rights to advanced AI risks preventing shutdowns of self-preserving systems and undermining necessary technical and societal guardrails.
Venture
fromTechCrunch
3 weeks ago

VCs predict enterprises will spend more on AI in 2026 - through fewer vendors | TechCrunch

Enterprises will consolidate AI spending in 2026, increasing budgets for a few proven vendors while cutting experimentation and redundant tools.
#ai-psychosis
fromFuturism
3 weeks ago
Artificial intelligence

Doctors Say AI Use Is Almost Certainly Linked to Developing Psychosis

#openai-hiring
fromBusiness Insider
4 weeks ago
Artificial intelligence

Sam Altman says OpenAI's latest job opening pays over half a million dollars a year and is 'stressful'

Artificial intelligence
fromFortune
4 weeks ago

OpenAI is hiring a 'head of preparedness' with a $550,000 salary to mitigate AI dangers that CEO Sam Altman warns will be 'stressful' | Fortune

OpenAI is hiring a Head of Preparedness, offering $555,000 plus equity, to reduce AI harms including mental-health, cybersecurity, biological, and self-improvement risks.
Artificial intelligence
fromNature
4 weeks ago

Let 2026 be the year the world comes together for AI safety

AI technologies must be safe and transparent, and all nations should enact laws and policies to ensure safety across sectors and markets.
Artificial intelligence
fromFortune
4 weeks ago

'Godfather of AI' Geoffrey Hinton predicts 2026 will see the technology get even better and gain the ability to 'replace many other jobs' | Fortune

AI capabilities will rapidly improve, enabling replacement of many jobs including software engineering as task efficiency doubles every several months.
Artificial intelligence
fromTechCrunch
4 weeks ago

OpenAI is looking for a new Head of Preparedness | TechCrunch

OpenAI is recruiting a Head of Preparedness to study and mitigate emerging AI risks across cybersecurity, mental health, biological capabilities, and self-improving systems.
Artificial intelligence
fromEngadget
4 weeks ago

OpenAI is hiring a new Head of Preparedness to try to predict and mitigate AI's harms

OpenAI is hiring a Head of Preparedness to anticipate model harms, guide safety strategy, and address mental-health and misuse risks after executive turnover.
fromInfoQ
1 month ago

Orion: New Zero-Telemetry, Zero-Ad, AI-Proof Browser for Privacy-Focused Users

Kagi has released Orion 1.0, a web browser that features privacy by default, zero telemetry, and no integrated ad-tracking technology. Orion supports both Chrome and Firefox extensions and intentionally excludes AI from its core to prioritize security, privacy, and performance. After six years of development, Orion ships for macOS, iOS, and iPadOS with upcoming Linux and Windows versions. Orion is based on WebKit and follows a freemium model.
Privacy technologies
fromBusiness Insider
1 month ago

A Nobel Prize-winning physicist explains how to use AI without letting it replace your thinking

Think AI makes you smarter? Probably not, according to Saul Perlmutter, a Nobel Prize-winning physicist who was credited for discovering that the universe's expansion is accelerating. He said AI's biggest danger is psychological: it can give people the illusion they understand something when they don't, weakening judgment just as the technology becomes more embedded in our daily work and learning.
Higher education
fromBusiness Insider
1 month ago

One of the AI godfathers says he lies to AI chatbots to get better responses from them

"I wanted honest advice, honest feedback. But because it is sycophantic, it's going to lie," he said. Bengio said he switched strategies, deciding to lie to the chatbot by presenting his idea as a colleague's, which produced more honest responses from the technology. "If it knows it's me, it wants to please me," he said.
Artificial intelligence
Artificial intelligence
fromBusiness Insider
1 month ago

A godfather of AI shares career advice in the age of AI: Work on being a 'beautiful human being'

Cultivate compassion, responsibility, presence, and the ability to comfort others because human touch will gain value as AI automates many jobs.
Artificial intelligence
fromZDNET
1 month ago

Why complex reasoning models could make misbehaving AI easier to catch

Longer, more detailed chain-of-thought model outputs generally make it easier to predict and monitor model behavior, enabling earlier detection of deception or misbehavior.
Artificial intelligence
fromTechCrunch
1 month ago

New York Governor Kathy Hochul signs RAISE Act to regulate AI safety | TechCrunch

New York enacted the RAISE Act requiring AI developers to publish safety protocols, report incidents within 72 hours, and face fines up to $3 million.