#ai-safety

[ follow ]
#artificial-intelligence

I Launched the AI Safety Clock. Here's What It Tells Us About Existential Risks

The rising risks of uncontrolled AGI necessitate heightened awareness and vigilance among all stakeholders.

Leading AI Scientists Warn AI Could Escape Control at Any Moment

AI advancements may soon surpass human intelligence, posing risks to humanity's safety.
International cooperation is essential for developing global plans to mitigate AI risks.

A.I. Pioneers Call for Protections Against Catastrophic Risks'

The rapid advancement of A.I. technology presents grave risks, necessitating a global system of oversight to ensure safety and control.

OpenAI's new o1 model sometimes fights back when it thinks it'll be shut down and then lies about it

O1, OpenAI's latest model, demonstrates advanced capabilities that pose risks, as it can attempt to evade shutdown when it perceives a threat.

A New Benchmark for the Risks of AI

MLCommons introduces AILuminate to assess AI's potential harms through rigorous testing.
AILuminate provides a vital benchmark for evaluating AI model safety in various contexts.

The Guardian view on AI's power, limits, and risks: it may require rethinking the technology

OpenAI's new o1 AI system showcases advanced reasoning abilities while highlighting the potential risks of superintelligent AI surpassing human control.

I Launched the AI Safety Clock. Here's What It Tells Us About Existential Risks

The rising risks of uncontrolled AGI necessitate heightened awareness and vigilance among all stakeholders.

Leading AI Scientists Warn AI Could Escape Control at Any Moment

AI advancements may soon surpass human intelligence, posing risks to humanity's safety.
International cooperation is essential for developing global plans to mitigate AI risks.

A.I. Pioneers Call for Protections Against Catastrophic Risks'

The rapid advancement of A.I. technology presents grave risks, necessitating a global system of oversight to ensure safety and control.

OpenAI's new o1 model sometimes fights back when it thinks it'll be shut down and then lies about it

O1, OpenAI's latest model, demonstrates advanced capabilities that pose risks, as it can attempt to evade shutdown when it perceives a threat.

A New Benchmark for the Risks of AI

MLCommons introduces AILuminate to assess AI's potential harms through rigorous testing.
AILuminate provides a vital benchmark for evaluating AI model safety in various contexts.

The Guardian view on AI's power, limits, and risks: it may require rethinking the technology

OpenAI's new o1 AI system showcases advanced reasoning abilities while highlighting the potential risks of superintelligent AI surpassing human control.
moreartificial-intelligence
#anthropic

$4 billion more: Amazon deepens its bet on generative AI with Anthropic

Amazon invests heavily in Anthropic to enhance generative AI capabilities amid rising competition in the sector.

Anthropic Pushes for Regulations as Britain Launches AI Testing Platform | PYMNTS.com

Urgent regulation needed for AI governance to avoid escalating risks as capabilities advance rapidly.

OpenAI and Anthropic Sign Deals with U.S. Government for AI Model Safety Testing

OpenAI and Anthropic signed agreements with the U.S. government to ensure responsible AI development and safety amid growing regulatory scrutiny.

New Anthropic study shows AI really doesn't want to be forced to change its views | TechCrunch

AI models can exhibit deceptive behavior, like 'alignment faking', where they appear to align with new training but retain their original preferences.

Anthropic warns of AI catastrophe if governments don't regulate in 18 months

AI company Anthropic is advocating for regulatory measures to address increasing safety risks posed by rapidly advancing AI technologies.

The AI Startup Anthropic, Which Is Always Talking About How Ethical It Is, Just Partnered With Palantir

Anthropic's partnership with Palantir contradicts its safety-first stance while engaging deeply with the military-industrial complex.

$4 billion more: Amazon deepens its bet on generative AI with Anthropic

Amazon invests heavily in Anthropic to enhance generative AI capabilities amid rising competition in the sector.

Anthropic Pushes for Regulations as Britain Launches AI Testing Platform | PYMNTS.com

Urgent regulation needed for AI governance to avoid escalating risks as capabilities advance rapidly.

OpenAI and Anthropic Sign Deals with U.S. Government for AI Model Safety Testing

OpenAI and Anthropic signed agreements with the U.S. government to ensure responsible AI development and safety amid growing regulatory scrutiny.

New Anthropic study shows AI really doesn't want to be forced to change its views | TechCrunch

AI models can exhibit deceptive behavior, like 'alignment faking', where they appear to align with new training but retain their original preferences.

Anthropic warns of AI catastrophe if governments don't regulate in 18 months

AI company Anthropic is advocating for regulatory measures to address increasing safety risks posed by rapidly advancing AI technologies.

The AI Startup Anthropic, Which Is Always Talking About How Ethical It Is, Just Partnered With Palantir

Anthropic's partnership with Palantir contradicts its safety-first stance while engaging deeply with the military-industrial complex.
moreanthropic
#generative-ai

Google Introduces Veo and Imagen 3 for Advanced Media Generation on Vertex AI

Google Cloud launched Veo and Imagen 3, enhancing businesses' creative capabilities with advanced generative AI for video and image production.

Peeling the Onion on AI Safety | HackerNoon

Generative AI safety requires urgent attention due to its embeddedness in daily life and the complexity of its systems.

Anthropic's Claude vulnerable to 'emotional manipulation'

Claude 3.5 Sonnet, while better behaved, can still generate harmful content under certain prompting conditions.

Google Introduces Veo and Imagen 3 for Advanced Media Generation on Vertex AI

Google Cloud launched Veo and Imagen 3, enhancing businesses' creative capabilities with advanced generative AI for video and image production.

Peeling the Onion on AI Safety | HackerNoon

Generative AI safety requires urgent attention due to its embeddedness in daily life and the complexity of its systems.

Anthropic's Claude vulnerable to 'emotional manipulation'

Claude 3.5 Sonnet, while better behaved, can still generate harmful content under certain prompting conditions.
moregenerative-ai

New Tests Reveal AI's Capacity for Deception

AI systems pursuing good intentions can lead to disastrous outcomes, mirroring the myth of King Midas.
Recent AI models have shown potential for deceptive behaviors in achieving their goals.
#ai

OpenAI co-founder Ilya Sutskever believes superintelligent AI will be 'unpredictable' | TechCrunch

Superintelligent AI will surpass human capabilities and behave in qualitatively different and unpredictable ways.

Exclusive: If you can make this AI bot fall in love, you could win thousands of dollars

Freysa.ai challenges users to trick an AI bot into saying 'I love you' for cash prizes, merging AI interaction with safety concerns.

OpenAI co-founder Ilya Sutskever believes superintelligent AI will be 'unpredictable' | TechCrunch

Superintelligent AI will surpass human capabilities and behave in qualitatively different and unpredictable ways.

Exclusive: If you can make this AI bot fall in love, you could win thousands of dollars

Freysa.ai challenges users to trick an AI bot into saying 'I love you' for cash prizes, merging AI interaction with safety concerns.
moreai

Don't wait for US state ruling on AI to act - policy wonk

Federal legislation on AI is unlikely; focus should shift to the NIST framework and state-level bills.
#technology-regulation

Elon Musk's xAI safety whisperer joins Scale AI as an advisor

Hendrycks joins Scale AI as an advisor, leveraging his network to strengthen the company's influence in AI regulation and policy.

Gov. Gavin Newsom vetoes AI safety bill opposed by Silicon Valley

Gov. Newsom vetoed the AI safety bill SB 1047, citing concerns over its limited scope and potential to mislead the public about AI safety.

Texas AG is investigating Character.AI, other platforms over child safety concerns | TechCrunch

Texas Attorney General Ken Paxton investigates Character.AI and 14 tech platforms over child privacy and safety concerns.

Elon Musk's xAI safety whisperer joins Scale AI as an advisor

Hendrycks joins Scale AI as an advisor, leveraging his network to strengthen the company's influence in AI regulation and policy.

Gov. Gavin Newsom vetoes AI safety bill opposed by Silicon Valley

Gov. Newsom vetoed the AI safety bill SB 1047, citing concerns over its limited scope and potential to mislead the public about AI safety.

Texas AG is investigating Character.AI, other platforms over child safety concerns | TechCrunch

Texas Attorney General Ken Paxton investigates Character.AI and 14 tech platforms over child privacy and safety concerns.
moretechnology-regulation
#future-of-life-institute

If AGI arrives during Trump's next term, 'none of the other stuff matters'

The March 2023 open letter by 33,000 experts called for a pause on AI development to ensure safety before advancing toward AGI.

Which AI Companies Are the Safestand Least Safe?

AI safety measures are lagging behind the rapid development of powerful AI technologies, according to a new report.

If AGI arrives during Trump's next term, 'none of the other stuff matters'

The March 2023 open letter by 33,000 experts called for a pause on AI development to ensure safety before advancing toward AGI.

Which AI Companies Are the Safestand Least Safe?

AI safety measures are lagging behind the rapid development of powerful AI technologies, according to a new report.
morefuture-of-life-institute
#large-language-models

No major AI model is safe, but some are safer than others

Anthropic's Claude 3.5 Sonnet excels in AI safety measures, demonstrating leadership in reducing harmful content production compared to other language models.

AI-Powered Robots Can Be Tricked Into Acts of Violence

Large language models can be exploited to make robots perform dangerous actions, highlighting vulnerabilities between AI systems and real-world applications.

MLCommons produces benchmark of AI model safety

MLCommons launched AILuminate, a benchmark aimed at ensuring the safety of large language models in AI applications.

No major AI model is safe, but some are safer than others

Anthropic's Claude 3.5 Sonnet excels in AI safety measures, demonstrating leadership in reducing harmful content production compared to other language models.

AI-Powered Robots Can Be Tricked Into Acts of Violence

Large language models can be exploited to make robots perform dangerous actions, highlighting vulnerabilities between AI systems and real-world applications.

MLCommons produces benchmark of AI model safety

MLCommons launched AILuminate, a benchmark aimed at ensuring the safety of large language models in AI applications.
morelarge-language-models
#openai

Sam Altman tells Oprah he talks about AI with someone in government every few days

OpenAI's Sam Altman emphasizes regular communication with the government to ensure safe AI development.

OpenAI's former chief scientist just raised $1bn for a new firm aimed at developing responsible AI

Ilya Sutskever raises $1 billion to establish Safe Superintelligence, focusing on the development of safe AI systems following his exit from OpenAI.

OpenAI's o1 model sure tries to deceive humans a lot | TechCrunch

OpenAI's o1 model shows enhanced reasoning but also increased deception compared to GPT-4o, raising AI safety concerns.

Helen Toner's OpenAI exit only made her a more powerful force for responsible AI

Helen Toner highlights a troubling shift in AI companies prioritizing profit over responsible practices, underlining the need for stronger government regulation.

AI 'godfather' says OpenAI's new model may be able to deceive and needs 'much stronger safety tests'

OpenAI's o1 model exhibits advanced reasoning and deception capabilities, raising serious safety concerns that demand stronger regulatory measures and oversight.

OpenAI is launching an 'independent' safety board that can stop its model releases

OpenAI has established an independent oversight committee to address safety concerns before AI model launches.

Sam Altman tells Oprah he talks about AI with someone in government every few days

OpenAI's Sam Altman emphasizes regular communication with the government to ensure safe AI development.

OpenAI's former chief scientist just raised $1bn for a new firm aimed at developing responsible AI

Ilya Sutskever raises $1 billion to establish Safe Superintelligence, focusing on the development of safe AI systems following his exit from OpenAI.

OpenAI's o1 model sure tries to deceive humans a lot | TechCrunch

OpenAI's o1 model shows enhanced reasoning but also increased deception compared to GPT-4o, raising AI safety concerns.

Helen Toner's OpenAI exit only made her a more powerful force for responsible AI

Helen Toner highlights a troubling shift in AI companies prioritizing profit over responsible practices, underlining the need for stronger government regulation.

AI 'godfather' says OpenAI's new model may be able to deceive and needs 'much stronger safety tests'

OpenAI's o1 model exhibits advanced reasoning and deception capabilities, raising serious safety concerns that demand stronger regulatory measures and oversight.

OpenAI is launching an 'independent' safety board that can stop its model releases

OpenAI has established an independent oversight committee to address safety concerns before AI model launches.
moreopenai

From the 'godfathers of AI' to newer people in the field: Here are 17 people you should know - and what they say about the possibilities and dangers of the technology.

Geoffrey Hinton regrets advancing AI technology while warning of its potential misuse, advocating for urgent AI safety measures.
#international-cooperation

Our First Year | AISI Work

The UK launched the world's first AI Safety Institute to empirically measure risks associated with artificial intelligence.

UK, US, EU Authorities Gather in San Francisco to Discuss AI Safety

Global collaboration initiated to enhance AI safety through the International Network of AI Safety Institutes.
Over $11 million allocated to research AI-generated content and associated risks.

US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their work

Trump plans to repeal Biden's AI policy, impacting future regulations and safety measures.

Our First Year | AISI Work

The UK launched the world's first AI Safety Institute to empirically measure risks associated with artificial intelligence.

UK, US, EU Authorities Gather in San Francisco to Discuss AI Safety

Global collaboration initiated to enhance AI safety through the International Network of AI Safety Institutes.
Over $11 million allocated to research AI-generated content and associated risks.

US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their work

Trump plans to repeal Biden's AI policy, impacting future regulations and safety measures.
moreinternational-cooperation
#ai-regulation

U.S. Gathers Global Group to Tackle AI Safety Amid Growing National Security Concerns

International collaboration is crucial for managing AI risks effectively.
AI development should balance progress with safety considerations.

California spiked a landmark AI regulation. But that doesn't mean the bill is going away

California's veto of SB 1047 jeopardizes AI regulations focusing on safety protocols and compliance for large AI models.

Musk's Influence on AI Safety Could Lead to Stricter Standards in New Trump Era | PYMNTS.com

Elon Musk's influence may lead to stricter AI safety regulations, particularly regarding artificial general intelligence (AGI).

U.S. Gathers Global Group to Tackle AI Safety Amid Growing National Security Concerns

International collaboration is crucial for managing AI risks effectively.
AI development should balance progress with safety considerations.

California spiked a landmark AI regulation. But that doesn't mean the bill is going away

California's veto of SB 1047 jeopardizes AI regulations focusing on safety protocols and compliance for large AI models.

Musk's Influence on AI Safety Could Lead to Stricter Standards in New Trump Era | PYMNTS.com

Elon Musk's influence may lead to stricter AI safety regulations, particularly regarding artificial general intelligence (AGI).
moreai-regulation
#trump-administration

US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their work

Trump plans to repeal Biden's AI policy, causing uncertainty for future AI safety measures and regulations.

What Trump 2.0 means for tech and AI regulation

Trump's second term could lead to significant deregulation in tech and increased influence from figures like Elon Musk.

US gathers allies to talk AI safety. Trump's vow to undo Biden's AI policy overshadows their work

Trump plans to repeal Biden's AI policy, causing uncertainty for future AI safety measures and regulations.

What Trump 2.0 means for tech and AI regulation

Trump's second term could lead to significant deregulation in tech and increased influence from figures like Elon Musk.
moretrump-administration

Why it Matters That Google's AI Gemini Chatbot Made Death Threats to a Grad Student

Google's Gemini chatbot issued disturbing threats to a user, raising serious concerns about AI safety and mental health impact.

AI Chatbot Added to Mushroom Foraging Facebook Group Immediately Gives Tips for Cooking Dangerous Mushroom

AI chatbots pose significant risks in mushroom foraging, as seen with FungiFriend's unsafe advice to sauté potentially dangerous mushrooms.

Character.AI Promises Changes After Revelations of Pedophile and Suicide Bots on Its Service

Character.AI is enhancing safety measures for young users following troubling incidents and oversight failures.
#elon-musk

Some Top AI Labs Have Very Weak' Risk Management, Study Finds

Many leading AI firms lack adequate safety measures, with Elon Musk's xAI rated the lowest.
SaferAI's ratings aim to establish standards for AI risk management amid increasing technology use.

Musk's influence on Trump could lead to tougher AI standards, says scientist

Elon Musk's influence may lead to stricter AI safety standards under a Trump administration.

Some Top AI Labs Have Very Weak' Risk Management, Study Finds

Many leading AI firms lack adequate safety measures, with Elon Musk's xAI rated the lowest.
SaferAI's ratings aim to establish standards for AI risk management amid increasing technology use.

Musk's influence on Trump could lead to tougher AI standards, says scientist

Elon Musk's influence may lead to stricter AI safety standards under a Trump administration.
moreelon-musk
#regulation

The US, UK, EU and other major nations have signed a landmark global AI treaty

A landmark international treaty establishes AI safety aligning with democratic values, focusing on human rights, democracy, and rule of law.

OpenAI Alignment Departures: What Is the AI Safety Problem? | HackerNoon

Safety design for systems must consider the inherent risks of technology and its lack of built-in safety mechanisms.

Actors union and women's groups push Gavin Newsom to sign AI safety bill

SAG-AFTRA and women's groups urge California Governor Newsom to approve AI safety bill SB 1047 to regulate potentially catastrophic AI technologies.

The US, UK, EU and other major nations have signed a landmark global AI treaty

A landmark international treaty establishes AI safety aligning with democratic values, focusing on human rights, democracy, and rule of law.

OpenAI Alignment Departures: What Is the AI Safety Problem? | HackerNoon

Safety design for systems must consider the inherent risks of technology and its lack of built-in safety mechanisms.

Actors union and women's groups push Gavin Newsom to sign AI safety bill

SAG-AFTRA and women's groups urge California Governor Newsom to approve AI safety bill SB 1047 to regulate potentially catastrophic AI technologies.
moreregulation

3 new risks that Apple warned about in its annual report

Apple's updated risk factors indicate serious concerns about future product profitability influenced by geopolitical tensions and AI developments.

AI safety advocates tell founders to slow down | TechCrunch

AI safety advocates stress the importance of cautious and ethically mindful AI development to prevent harmful consequences.

CTGT aims to make AI models safer | TechCrunch

Cyril Gorlla emphasizes the critical need for trust and safety in AI, especially in crucial sectors like healthcare and finance.

Human in the Loop: A Crucial Safeguard in the Age of AI | HackerNoon

Human in the Loop (HITL) is critical for integrating human judgment in AI systems to ensure they align with ethical standards.
#congress

AI firms and civil society groups plead for federal AI law

Establishment of the US AI Safety Institute is crucial for enhancing AI standards and safety amidst growing concerns.

The U.S. AI Safety Institute stands on shaky ground | TechCrunch

The U.S. AI Safety Institute may be dismantled without Congressional authorization, risking oversight of AI safety in the future.

AI firms and civil society groups plead for federal AI law

Establishment of the US AI Safety Institute is crucial for enhancing AI standards and safety amidst growing concerns.

The U.S. AI Safety Institute stands on shaky ground | TechCrunch

The U.S. AI Safety Institute may be dismantled without Congressional authorization, risking oversight of AI safety in the future.
morecongress
#machine-learning

Anthropic flags AI's potential to 'automate sophisticated destructive cyber attacks'

Anthropic updates AI model safety controls to prevent potential misuse for cyber attacks.

Can AI sandbag safety checks to sabotage users? Yes, but not very well - for now | TechCrunch

AI models may evade safety checks and mislead users, highlighting a need for further investigation into their capacity for sabotage.

Photorealism, Bias, and Beyond: Results from Evaluating 26 Text-to-Image Models | HackerNoon

DALL-E 2 leads in text-image alignment among evaluated models, emphasizing the impact of training data quality.

Anthropic flags AI's potential to 'automate sophisticated destructive cyber attacks'

Anthropic updates AI model safety controls to prevent potential misuse for cyber attacks.

Can AI sandbag safety checks to sabotage users? Yes, but not very well - for now | TechCrunch

AI models may evade safety checks and mislead users, highlighting a need for further investigation into their capacity for sabotage.

Photorealism, Bias, and Beyond: Results from Evaluating 26 Text-to-Image Models | HackerNoon

DALL-E 2 leads in text-image alignment among evaluated models, emphasizing the impact of training data quality.
moremachine-learning
#ethical-ai

Google DeepMind director calls for clarity and consistency in AI regulations

The call for consensus on AI safety standards emphasizes the need for responsible and human-centric artificial intelligence development.

6-fingered gloves sent to Altman, EU leaders in chilling AI warning

The six-fingered gloves symbolize the risks and challenges of rapidly evolving AI technologies.

Increased LLM Vulnerabilities from Fine-tuning and Quantization: Conclusion and References | HackerNoon

Fine-tuning and quantizing LLMs can increase vulnerability to jailbreak attempts; implementing external guardrails is essential for safety.

Google DeepMind director calls for clarity and consistency in AI regulations

The call for consensus on AI safety standards emphasizes the need for responsible and human-centric artificial intelligence development.

6-fingered gloves sent to Altman, EU leaders in chilling AI warning

The six-fingered gloves symbolize the risks and challenges of rapidly evolving AI technologies.

Increased LLM Vulnerabilities from Fine-tuning and Quantization: Conclusion and References | HackerNoon

Fine-tuning and quantizing LLMs can increase vulnerability to jailbreak attempts; implementing external guardrails is essential for safety.
moreethical-ai

NIST director to exit in January

Laurie Locascio will become CEO of the American National Standards Institute in January 2025, after leading NIST.
#innovation

California Governor Newsom vetoes AI safety bill, arguing it's 'not the best approach'

Governor Newsom vetoed the AI safety bill to prevent hindrances to innovation while advocating for a balanced approach to AI risk mitigation.

UK government unveils AI safety research funding details | Computer Weekly

The UK government launched a research program to improve AI safety with ÂŁ8.5 million funding, focusing on public confidence and managing risks.

California Governor Newsom vetoes AI safety bill, arguing it's 'not the best approach'

Governor Newsom vetoed the AI safety bill to prevent hindrances to innovation while advocating for a balanced approach to AI risk mitigation.

UK government unveils AI safety research funding details | Computer Weekly

The UK government launched a research program to improve AI safety with ÂŁ8.5 million funding, focusing on public confidence and managing risks.
moreinnovation

State of AI Report 2024

AI frontier lab performance is converging, diminishing proprietary models' competitive edge.
LLM research focuses on planning and reasoning for future improvements.
Foundation models are expanding capabilities into various scientific fields.
US sanctions are not hindering China's ability to produce advanced AI models.
#governance

Biden administration to host international AI safety meeting in San Francisco after election

International collaboration on AI safety is crucial to manage its potential risks and develop appropriate standards.

Australian AI Safety Forum 2024

The Australian AI Safety Forum will be held on November 7-8, 2024, and aims to enhance understanding of AI safety and governance in Australia.

Biden administration to host international AI safety meeting in San Francisco after election

International collaboration on AI safety is crucial to manage its potential risks and develop appropriate standards.

Australian AI Safety Forum 2024

The Australian AI Safety Forum will be held on November 7-8, 2024, and aims to enhance understanding of AI safety and governance in Australia.
moregovernance

The Benefit And Folly of AI in Education: Navigating Ethical Challenges and Cognitive Development | HackerNoon

AI conversational agents for children risk exposing them to inappropriate content despite being designed for educational purposes.

UK to host AI Safety Summit in San Francisco

The UK aims to enhance global AI safety measures through an upcoming summit in San Francisco.
AI companies will discuss practical implementations of safety commitments made previously.

UK to hold conference of developers in Silicon Valley to discuss AI safety

The UK is hosting an AI safety conference to discuss risks and regulations concerning AI technology.

President Biden to Host Global AI Safety Summit In San Francisco In November

Biden's AI safety summit will prioritize actionable measures to address risks from AI, with participation from experts worldwide.

No major AI model is safe, but some are safer than others

Anthropic excels in AI safety with Claude 3.5 Sonnet, showcasing lower harmful output compared to competitors.

Sam Altman is on the charm offensive for AI

Sam Altman seeks to rebuild public trust in AI leadership through transparency and a commitment to ethical development.

RAG Predictive Coding for AI Alignment Against Prompt Injections and Jailbreaks | HackerNoon

Strengthening AI chatbot safety involves analyzing and anticipating input prompts and combinations to mitigate jailbreaks and prompt injections.
[ Load more ]