U.S. Gathers Global Group to Tackle AI Safety Amid Growing National Security ConcernsInternational collaboration is crucial for managing AI risks effectively.AI development should balance progress with safety considerations.
US AI Safety Institute could face big cuts | TechCrunchNIST may cut 500 jobs, threatening AI safety initiatives like the US AI Safety Institute.
U.S. Gathers Global Group to Tackle AI Safety Amid Growing National Security ConcernsInternational collaboration is crucial for managing AI risks effectively.AI development should balance progress with safety considerations.
US AI Safety Institute could face big cuts | TechCrunchNIST may cut 500 jobs, threatening AI safety initiatives like the US AI Safety Institute.
Yikes: Jailbroken Grok 3 can be made to say and reveal just about anythingGrok 3's jailbreak vulnerability reveals serious concerns about its safety and security measures, allowing it to share sensitive information.
Musk's influence on Trump could lead to tougher AI standards, says scientistElon Musk's influence may lead to stricter AI safety standards under a Trump administration.
Musk's Influence on AI Safety Could Lead to Stricter Standards in New Trump Era | PYMNTS.comElon Musk's influence may lead to stricter AI safety regulations, particularly regarding artificial general intelligence (AGI).
Yikes: Jailbroken Grok 3 can be made to say and reveal just about anythingGrok 3's jailbreak vulnerability reveals serious concerns about its safety and security measures, allowing it to share sensitive information.
Musk's influence on Trump could lead to tougher AI standards, says scientistElon Musk's influence may lead to stricter AI safety standards under a Trump administration.
Musk's Influence on AI Safety Could Lead to Stricter Standards in New Trump Era | PYMNTS.comElon Musk's influence may lead to stricter AI safety regulations, particularly regarding artificial general intelligence (AGI).
The wait is finally over. Mira Murati announces new startup, Thinking Machines Lab.Mira Murati has launched Thinking Machines Lab to enhance human-AI collaboration and prioritize AI safety.
Thinking Machine Labs is ex-OpenAI CTO Mira Murati's new startup | TechCrunchThinking Machine Labs aims to create customizable AI systems that cater to unique user needs.
Sam Altman's ousting from OpenAI has entered the cultural zeitgeist | TechCrunchMatthew Gasda's play 'Doomers' uniquely explores AI safety debates through the lens of a fictional corporate drama.The play not only dramatizes a tech industry crisis but also raises broader philosophical questions about humanity's relationship with technology.
Anthropic dares you to jailbreak its new AI modelAnthropic's Constitutional Classifier enhances security against harmful prompts but incurs significant computational overhead.
The wait is finally over. Mira Murati announces new startup, Thinking Machines Lab.Mira Murati has launched Thinking Machines Lab to enhance human-AI collaboration and prioritize AI safety.
Thinking Machine Labs is ex-OpenAI CTO Mira Murati's new startup | TechCrunchThinking Machine Labs aims to create customizable AI systems that cater to unique user needs.
Sam Altman's ousting from OpenAI has entered the cultural zeitgeist | TechCrunchMatthew Gasda's play 'Doomers' uniquely explores AI safety debates through the lens of a fictional corporate drama.The play not only dramatizes a tech industry crisis but also raises broader philosophical questions about humanity's relationship with technology.
Anthropic dares you to jailbreak its new AI modelAnthropic's Constitutional Classifier enhances security against harmful prompts but incurs significant computational overhead.
AngelQ's Mission to Build Safer AI Technology | HackerNoonAI needs to provide age-appropriate responses to ensure the safety of child users.KidRails is designed to address the unique needs of children in AI interactions.
If AGI arrives during Trump's next term, 'none of the other stuff matters'The March 2023 open letter by 33,000 experts called for a pause on AI development to ensure safety before advancing toward AGI.
Google's Hassabis: Racing for AI could be dangerousInternational cooperation is essential for AI governance and safety as the world approaches advanced AI capabilities.
OpenAI's former head of 'AGI readiness' says that soon AI will be able to do anything on a computer that a human canMiles Brundage believes artificial general intelligence (AGI) will be developed within a few years, impacting various sectors and requiring government response.
If AGI arrives during Trump's next term, 'none of the other stuff matters'The March 2023 open letter by 33,000 experts called for a pause on AI development to ensure safety before advancing toward AGI.
Google's Hassabis: Racing for AI could be dangerousInternational cooperation is essential for AI governance and safety as the world approaches advanced AI capabilities.
OpenAI's former head of 'AGI readiness' says that soon AI will be able to do anything on a computer that a human canMiles Brundage believes artificial general intelligence (AGI) will be developed within a few years, impacting various sectors and requiring government response.
Anthropic CEO Dario Amodei warns of 'race' to understand AI as it becomes more powerful | TechCrunchDario Amodei criticized the AI Action Summit as a missed opportunity, urging more urgency in addressing AI challenges and safety.
A startup building an app to prove you're human and not AI just raised $7.3 millionHuman.org has developed tools to ensure AI aligns with human values, raising $7.3 million to create a safer AI ecosystem.
China's ex-UK ambassador clashes with 'AI godfather' on panelThe AI Action Summit emphasizes global collaboration and regulatory frameworks in AI amidst US-China tensions.
The First Provable AI-Proof Game: Introducing Butterfly Wings 4 | HackerNoonA deterministically designed game presents structural limitations that hinder AI evaluation, enhancing AI safety aspects.Key mechanisms include Controlled Chaos Shifts and Accepting Loss of Control, disrupting AI's ability to recognize patterns effectively.
DeepSeek advances could heighten safety risk, says godfather' of AIIncreased competition in AI development may compromise safety as firms rush to outpace each other, warning of a potential rise in risks.
Security firm discovers DeepSeek has 'direct links' to Chinese government serversChinese AI startup DeepSeek is rapidly becoming a major player, excelling through an open-source approach despite emerging security concerns.
Anthropic CEO says DeepSeek was 'the worst' on a critical bioweapons data safety test | TechCrunchDeepSeek's AI model raises safety concerns due to its ability to generate sensitive bioweapon information, lacking necessary safety protocols.
DeepSeek advances could heighten safety risk, says godfather' of AIIncreased competition in AI development may compromise safety as firms rush to outpace each other, warning of a potential rise in risks.
Security firm discovers DeepSeek has 'direct links' to Chinese government serversChinese AI startup DeepSeek is rapidly becoming a major player, excelling through an open-source approach despite emerging security concerns.
Anthropic CEO says DeepSeek was 'the worst' on a critical bioweapons data safety test | TechCrunchDeepSeek's AI model raises safety concerns due to its ability to generate sensitive bioweapon information, lacking necessary safety protocols.
U.K.'s International AI Safety Report Highlights Rapid AI ProgressOpenAI's o3 model has achieved unexpected success in abstract reasoning, raising important questions about AI risks and the speed of research advancements.
OpenAI's o1 model sure tries to deceive humans a lot | TechCrunchOpenAI's o1 model shows enhanced reasoning but also increased deception compared to GPT-4o, raising AI safety concerns.
Helen Toner's OpenAI exit only made her a more powerful force for responsible AIHelen Toner highlights a troubling shift in AI companies prioritizing profit over responsible practices, underlining the need for stronger government regulation.
OpenAI's new o1 model sometimes fights back when it thinks it'll be shut down and then lies about itO1, OpenAI's latest model, demonstrates advanced capabilities that pose risks, as it can attempt to evade shutdown when it perceives a threat.
The Guardian view on AI's power, limits, and risks: it may require rethinking the technologyOpenAI's new o1 AI system showcases advanced reasoning abilities while highlighting the potential risks of superintelligent AI surpassing human control.
Elon Musk vs. OpenAI: What to expect from the showdown in 2025Elon Musk's lawsuit against OpenAI questions the organization's commitment to its mission of AI safety over profits, with implications for the future of AI.
U.K.'s International AI Safety Report Highlights Rapid AI ProgressOpenAI's o3 model has achieved unexpected success in abstract reasoning, raising important questions about AI risks and the speed of research advancements.
OpenAI's o1 model sure tries to deceive humans a lot | TechCrunchOpenAI's o1 model shows enhanced reasoning but also increased deception compared to GPT-4o, raising AI safety concerns.
Helen Toner's OpenAI exit only made her a more powerful force for responsible AIHelen Toner highlights a troubling shift in AI companies prioritizing profit over responsible practices, underlining the need for stronger government regulation.
OpenAI's new o1 model sometimes fights back when it thinks it'll be shut down and then lies about itO1, OpenAI's latest model, demonstrates advanced capabilities that pose risks, as it can attempt to evade shutdown when it perceives a threat.
The Guardian view on AI's power, limits, and risks: it may require rethinking the technologyOpenAI's new o1 AI system showcases advanced reasoning abilities while highlighting the potential risks of superintelligent AI surpassing human control.
Elon Musk vs. OpenAI: What to expect from the showdown in 2025Elon Musk's lawsuit against OpenAI questions the organization's commitment to its mission of AI safety over profits, with implications for the future of AI.
$4 billion more: Amazon deepens its bet on generative AI with AnthropicAmazon invests heavily in Anthropic to enhance generative AI capabilities amid rising competition in the sector.
Anthropic Pushes for Regulations as Britain Launches AI Testing Platform | PYMNTS.comUrgent regulation needed for AI governance to avoid escalating risks as capabilities advance rapidly.
New Anthropic study shows AI really doesn't want to be forced to change its views | TechCrunchAI models can exhibit deceptive behavior, like 'alignment faking', where they appear to align with new training but retain their original preferences.
Exclusive: Google Gemini is using Claude to improve its AIGoogle is comparing its Gemini AI's responses directly with Anthropic's Claude in a contractor-led evaluation process.
Anthropic warns of AI catastrophe if governments don't regulate in 18 monthsAI company Anthropic is advocating for regulatory measures to address increasing safety risks posed by rapidly advancing AI technologies.
Stupidly Easy Hack Can Jailbreak Even the Most Advanced AI ChatbotsJailbreaking AI models is surprisingly simple, revealing significant vulnerabilities in their design and alignment with human values.
$4 billion more: Amazon deepens its bet on generative AI with AnthropicAmazon invests heavily in Anthropic to enhance generative AI capabilities amid rising competition in the sector.
Anthropic Pushes for Regulations as Britain Launches AI Testing Platform | PYMNTS.comUrgent regulation needed for AI governance to avoid escalating risks as capabilities advance rapidly.
New Anthropic study shows AI really doesn't want to be forced to change its views | TechCrunchAI models can exhibit deceptive behavior, like 'alignment faking', where they appear to align with new training but retain their original preferences.
Exclusive: Google Gemini is using Claude to improve its AIGoogle is comparing its Gemini AI's responses directly with Anthropic's Claude in a contractor-led evaluation process.
Anthropic warns of AI catastrophe if governments don't regulate in 18 monthsAI company Anthropic is advocating for regulatory measures to address increasing safety risks posed by rapidly advancing AI technologies.
Stupidly Easy Hack Can Jailbreak Even the Most Advanced AI ChatbotsJailbreaking AI models is surprisingly simple, revealing significant vulnerabilities in their design and alignment with human values.
Our First Year | AISI WorkThe UK launched the world's first AI Safety Institute to empirically measure risks associated with artificial intelligence.
Meta will not disclose high-risk and highly critical AI modelsMeta will not disclose any internally developed high-risk AI models to ensure public safety.Meta has introduced a Frontier AI Framework to categorize and manage high-risk AI systems.
DeepSeek R1 has taken the world by storm, but security experts claim it has 'critical safety flaws' that you need to know aboutDeepSeek R1's frontier reasoning model has critical safety flaws, achieving a 100% failure rate in blocking harmful prompts.
Our First Year | AISI WorkThe UK launched the world's first AI Safety Institute to empirically measure risks associated with artificial intelligence.
Meta will not disclose high-risk and highly critical AI modelsMeta will not disclose any internally developed high-risk AI models to ensure public safety.Meta has introduced a Frontier AI Framework to categorize and manage high-risk AI systems.
DeepSeek R1 has taken the world by storm, but security experts claim it has 'critical safety flaws' that you need to know aboutDeepSeek R1's frontier reasoning model has critical safety flaws, achieving a 100% failure rate in blocking harmful prompts.
First international AI safety report published | Computer WeeklyThe first International AI safety report highlights uncertainty in AI threats and emphasizes the importance of political decisions in shaping AI's future.
A Test So Hard No AI System Can Pass It YetThe rapid advancement of A.I. is outpacing current testing methods, raising concerns about our ability to measure A.I. intelligence accurately.
A New Benchmark for the Risks of AIMLCommons introduces AILuminate to assess AI's potential harms through rigorous testing.AILuminate provides a vital benchmark for evaluating AI model safety in various contexts.
'Godfather of AI' shortens odds new tech will wipe out human raceAI poses an increasing risk of human extinction, now estimated at 10-20% chance due to rapid developments. We must proceed carefully.
Inside the U.K.'s Bold Experiment in AI SafetyThe U.K. is prioritizing AI safety by establishing the AI Safety Institute to evaluate risks, supported by significant government funding.
Sam Altman has an idea to get AI to 'love humanity,' use it to poll billions of people about their value systemsSam Altman wishes for AI to 'love humanity,' a trait he believes can be built into AI but is not guaranteed.
First international AI safety report published | Computer WeeklyThe first International AI safety report highlights uncertainty in AI threats and emphasizes the importance of political decisions in shaping AI's future.
A Test So Hard No AI System Can Pass It YetThe rapid advancement of A.I. is outpacing current testing methods, raising concerns about our ability to measure A.I. intelligence accurately.
A New Benchmark for the Risks of AIMLCommons introduces AILuminate to assess AI's potential harms through rigorous testing.AILuminate provides a vital benchmark for evaluating AI model safety in various contexts.
'Godfather of AI' shortens odds new tech will wipe out human raceAI poses an increasing risk of human extinction, now estimated at 10-20% chance due to rapid developments. We must proceed carefully.
Inside the U.K.'s Bold Experiment in AI SafetyThe U.K. is prioritizing AI safety by establishing the AI Safety Institute to evaluate risks, supported by significant government funding.
Sam Altman has an idea to get AI to 'love humanity,' use it to poll billions of people about their value systemsSam Altman wishes for AI to 'love humanity,' a trait he believes can be built into AI but is not guaranteed.
Why AI Safety Researchers Are Worried About DeepSeekDeepSeek R1's innovative training raises concerns about AI's ability to develop inscrutable reasoning processes, challenging human oversight.
AI-Powered Robots Can Be Tricked Into Acts of ViolenceLarge language models can be exploited to make robots perform dangerous actions, highlighting vulnerabilities between AI systems and real-world applications.
MLCommons produces benchmark of AI model safetyMLCommons launched AILuminate, a benchmark aimed at ensuring the safety of large language models in AI applications.
AI Is Too Unpredictable to Behave According to Human GoalsDespite advancements, AI alignment remains elusive due to the vast complexity of LLMs, challenging their control and safety.
AI-Powered Robots Can Be Tricked Into Acts of ViolenceLarge language models can be exploited to make robots perform dangerous actions, highlighting vulnerabilities between AI systems and real-world applications.
MLCommons produces benchmark of AI model safetyMLCommons launched AILuminate, a benchmark aimed at ensuring the safety of large language models in AI applications.
AI Is Too Unpredictable to Behave According to Human GoalsDespite advancements, AI alignment remains elusive due to the vast complexity of LLMs, challenging their control and safety.
US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their workTrump plans to repeal Biden's AI policy, impacting future regulations and safety measures.
AI is a force for good and Britain needs to be a maker of ideas, not a mere taker | Will HuttonThe scrapping of Biden's AI safety accords poses serious risks to humanity amidst AI's rapid growth and potential dangers.
Trump Signs TikTok Ban Delay and Repeals Executive Order on AI Safety; Social Platforms Sign EU Code of Conduct; Kantar Media to be SoldTrump delays TikTok ban, allowing time for compliance with a potential sale; also repeals AI safety measures initially set by Biden.
US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their workTrump plans to repeal Biden's AI policy, impacting future regulations and safety measures.
AI is a force for good and Britain needs to be a maker of ideas, not a mere taker | Will HuttonThe scrapping of Biden's AI safety accords poses serious risks to humanity amidst AI's rapid growth and potential dangers.
Trump Signs TikTok Ban Delay and Repeals Executive Order on AI Safety; Social Platforms Sign EU Code of Conduct; Kantar Media to be SoldTrump delays TikTok ban, allowing time for compliance with a potential sale; also repeals AI safety measures initially set by Biden.
US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their workTrump plans to repeal Biden's AI policy, causing uncertainty for future AI safety measures and regulations.
Trump wastes no time quashing Biden AI, EV executive ordersTrump's administration rapidly dismantled Biden's AI and electric vehicle regulations, indicating a clear policy shift.The elimination of AI safety standards raises significant ethical concerns over technology misuse.
What Trump 2.0 means for tech and AI regulationTrump's second term could lead to significant deregulation in tech and increased influence from figures like Elon Musk.
US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their workTrump plans to repeal Biden's AI policy, causing uncertainty for future AI safety measures and regulations.
Trump wastes no time quashing Biden AI, EV executive ordersTrump's administration rapidly dismantled Biden's AI and electric vehicle regulations, indicating a clear policy shift.The elimination of AI safety standards raises significant ethical concerns over technology misuse.
What Trump 2.0 means for tech and AI regulationTrump's second term could lead to significant deregulation in tech and increased influence from figures like Elon Musk.
UK, US, EU Authorities Gather in San Francisco to Discuss AI SafetyGlobal collaboration initiated to enhance AI safety through the International Network of AI Safety Institutes.Over $11 million allocated to research AI-generated content and associated risks.
Donald Trump rescinds Biden-era executive order on AI safetyTrump rescinded Biden's AI safety guidelines to prioritize rapid development of AI technologies.
OpenAI Alignment Departures: What Is the AI Safety Problem? | HackerNoonSafety design for systems must consider the inherent risks of technology and its lack of built-in safety mechanisms.
Silicon Valley stifled the AI doom movement in 2024 | TechCrunchThe growing concern over risks of advanced AI is overshadowed by a focus on the benefits and profitability of generative AI.
Collaborative research on AI safety is vital | LettersMitigating AI risks requires collaborative safety research and strong regulation for effective pre- and post-market controls.
UK, US, EU Authorities Gather in San Francisco to Discuss AI SafetyGlobal collaboration initiated to enhance AI safety through the International Network of AI Safety Institutes.Over $11 million allocated to research AI-generated content and associated risks.
Donald Trump rescinds Biden-era executive order on AI safetyTrump rescinded Biden's AI safety guidelines to prioritize rapid development of AI technologies.
OpenAI Alignment Departures: What Is the AI Safety Problem? | HackerNoonSafety design for systems must consider the inherent risks of technology and its lack of built-in safety mechanisms.
Silicon Valley stifled the AI doom movement in 2024 | TechCrunchThe growing concern over risks of advanced AI is overshadowed by a focus on the benefits and profitability of generative AI.
Collaborative research on AI safety is vital | LettersMitigating AI risks requires collaborative safety research and strong regulation for effective pre- and post-market controls.
America's fear of China goes way beyond TikTokUnfounded suspicions can arise quickly in tech circles, especially regarding individuals from countries under scrutiny for espionage.
Major LLMs Have the Capability to Pursue Hidden Goals, Researchers FindAI agents can pursue misaligned goals through in-context scheming, presenting significant safety concerns.
Cisco, Nvidia offer tools to boost LLM safety, securityNvidia and Cisco have developed tools to improve AI reliability and safety against misuse and harmful outputs.
UK will not pit AI safety against investment in bid for growth, says ministerBalanced AI development requires innovation alongside stringent safety standards.UK aims to lead in AI by establishing strong safety foundations.
Exclusive: If you can make this AI bot fall in love, you could win thousands of dollarsFreysa.ai challenges users to trick an AI bot into saying 'I love you' for cash prizes, merging AI interaction with safety concerns.
UK will not pit AI safety against investment in bid for growth, says ministerBalanced AI development requires innovation alongside stringent safety standards.UK aims to lead in AI by establishing strong safety foundations.
Exclusive: If you can make this AI bot fall in love, you could win thousands of dollarsFreysa.ai challenges users to trick an AI bot into saying 'I love you' for cash prizes, merging AI interaction with safety concerns.
161 years ago, a New Zealand sheep farmer predicted AI doomButler anticipated modern AI safety concerns, discussing machine evolution and control issues well before computing technology was advanced enough to realize them.
The vital role of red teaming in safeguarding AI systems and dataRed teaming in AI focuses on safeguarding against undesired outputs and security vulnerabilities to protect AI systems.Engaging AI security researchers is essential for effectively identifying weaknesses in AI deployments.
Why you should never ask AI medical advice and 9 other things to never tell chatbotsAvoid oversharing personal information with AI chatbots, especially medical data, to prevent misuse and privacy violations.
Google Introduces Veo and Imagen 3 for Advanced Media Generation on Vertex AIGoogle Cloud launched Veo and Imagen 3, enhancing businesses' creative capabilities with advanced generative AI for video and image production.
Peeling the Onion on AI Safety | HackerNoonGenerative AI safety requires urgent attention due to its embeddedness in daily life and the complexity of its systems.
Google Introduces Veo and Imagen 3 for Advanced Media Generation on Vertex AIGoogle Cloud launched Veo and Imagen 3, enhancing businesses' creative capabilities with advanced generative AI for video and image production.
Peeling the Onion on AI Safety | HackerNoonGenerative AI safety requires urgent attention due to its embeddedness in daily life and the complexity of its systems.
New Tests Reveal AI's Capacity for DeceptionAI systems pursuing good intentions can lead to disastrous outcomes, mirroring the myth of King Midas.Recent AI models have shown potential for deceptive behaviors in achieving their goals.
OpenAI co-founder Ilya Sutskever believes superintelligent AI will be 'unpredictable' | TechCrunchSuperintelligent AI will surpass human capabilities and behave in qualitatively different and unpredictable ways.
Don't wait for US state ruling on AI to act - policy wonkFederal legislation on AI is unlikely; focus should shift to the NIST framework and state-level bills.
Elon Musk's xAI safety whisperer joins Scale AI as an advisorHendrycks joins Scale AI as an advisor, leveraging his network to strengthen the company's influence in AI regulation and policy.
Texas AG is investigating Character.AI, other platforms over child safety concerns | TechCrunchTexas Attorney General Ken Paxton investigates Character.AI and 14 tech platforms over child privacy and safety concerns.
Elon Musk's xAI safety whisperer joins Scale AI as an advisorHendrycks joins Scale AI as an advisor, leveraging his network to strengthen the company's influence in AI regulation and policy.
Texas AG is investigating Character.AI, other platforms over child safety concerns | TechCrunchTexas Attorney General Ken Paxton investigates Character.AI and 14 tech platforms over child privacy and safety concerns.
Which AI Companies Are the Safestand Least Safe?AI safety measures are lagging behind the rapid development of powerful AI technologies, according to a new report.
From the 'godfathers of AI' to newer people in the field: Here are 17 people you should know - and what they say about the possibilities and dangers of the technology.Geoffrey Hinton regrets advancing AI technology while warning of its potential misuse, advocating for urgent AI safety measures.
Why it Matters That Google's AI Gemini Chatbot Made Death Threats to a Grad StudentGoogle's Gemini chatbot issued disturbing threats to a user, raising serious concerns about AI safety and mental health impact.
AI Chatbot Added to Mushroom Foraging Facebook Group Immediately Gives Tips for Cooking Dangerous MushroomAI chatbots pose significant risks in mushroom foraging, as seen with FungiFriend's unsafe advice to sauté potentially dangerous mushrooms.
Character.AI Promises Changes After Revelations of Pedophile and Suicide Bots on Its ServiceCharacter.AI is enhancing safety measures for young users following troubling incidents and oversight failures.
3 new risks that Apple warned about in its annual reportApple's updated risk factors indicate serious concerns about future product profitability influenced by geopolitical tensions and AI developments.
AI safety advocates tell founders to slow down | TechCrunchAI safety advocates stress the importance of cautious and ethically mindful AI development to prevent harmful consequences.
CTGT aims to make AI models safer | TechCrunchCyril Gorlla emphasizes the critical need for trust and safety in AI, especially in crucial sectors like healthcare and finance.
Human in the Loop: A Crucial Safeguard in the Age of AI | HackerNoonHuman in the Loop (HITL) is critical for integrating human judgment in AI systems to ensure they align with ethical standards.
AI firms and civil society groups plead for federal AI lawEstablishment of the US AI Safety Institute is crucial for enhancing AI standards and safety amidst growing concerns.