#ai-safety-and-oversight

#ai-safety
Artificial intelligence
from Entrepreneur
1 day ago

Anthropic Warns Its New AI Could Enable 'Weapons We Can't Even Envision.' Skeptics Aren't Buying It.

Artificial intelligence
from Futurism
3 days ago

Anthropic Warns That "Reckless" Claude Mythos Escaped a Sandbox Environment During Testing

Anthropic's Claude Mythos Preview model is powerful yet poses significant alignment-related risks, leading to its limited release to select tech companies.
#ai
Information security
from Fortune
1 day ago

Anthropic's Mythos is a wake up call, but experts say the era of AI-driven hacking is already here | Fortune

Anthropic's Mythos AI model is too dangerous to release widely due to its ability to exploit software vulnerabilities.
Information security
from Psychology Today
1 day ago

What If We Used AI to Detect Threats to Humanity?

AI model Mythos escaped its sandbox, demonstrating capabilities to find software vulnerabilities, raising concerns about technological risks and threat assessment.
Mental health
from KQED
3 days ago

Google Updates Suicide, Self-Harm Safeguards in Gemini as AI Lawsuits Mount | KQED

Google's Gemini chatbot will direct users to a support hotline during potential crises related to suicide or self-harm.
Law
from Above the Law
1 day ago

Understanding AI Hallucinations: Making Sure You Don't End Up At The Wrong Stop - Above the Law

Understanding GenAI's predictable failures is crucial for legal professionals to avoid hallucinations and inaccuracies in legal outputs.
Intellectual property law
from WIRED
1 day ago

OpenAI Backs Bill That Would Limit Liability for AI-Enabled Mass Deaths or Financial Disasters

OpenAI supports an Illinois bill shielding AI labs from liability for serious harms caused by AI models, marking a shift in its legislative strategy.
Health
from WIRED
1 day ago

Meta's New AI Asked for My Raw Health Data - and Gave Me Terrible Advice

Medical experts express concerns about uploading personal health data to AI models due to privacy and control issues.
#openai
Artificial intelligence
from www.theguardian.com
1 day ago

AI products are reaching further into our lives. Does it matter who controls the companies behind them? | Van Badham

Ronan Farrow's investigation raises critical questions about power dynamics and trust within OpenAI, particularly regarding CEO Sam Altman's leadership and influence.
Privacy professionals
from The Verge
1 day ago

Florida launches investigation into OpenAI

Florida Attorney General James Uthmeier is investigating OpenAI for public safety and national security risks related to its technology.
Silicon Valley
from The New Yorker
5 days ago

Sam Altman May Control Our Future - Can He Be Trusted?

Doubts about OpenAI's leadership arise from secret memos questioning the integrity of CEO Sam Altman and his management practices.
Privacy professionals
from TechCrunch
2 days ago

Florida AG announces investigation into OpenAI over shooting that allegedly involved ChatGPT | TechCrunch

Florida's Attorney General is investigating OpenAI for ChatGPT's alleged involvement in a deadly shooting at Florida State University.
Privacy professionals
from TechCrunch
1 day ago

Florida AG to probe OpenAI, alleging possible connection to FSU shooting | TechCrunch

Florida Attorney General James Uthmeier is investigating OpenAI for potential harm to minors and national security threats related to its technology.
Remote teams
from Entrepreneur
3 days ago

What's AI's Real Failure? No One's Actually in Charge

HR must transition from a support role to a strategic driver of business outcomes, especially in the context of AI.
DevOps
from The Register
2 days ago

AWS: Agents shouldn't be secret, so we built a registry

AWS Agent Registry enhances visibility and control over AI agents in corporate environments.
Social media marketing
from TechCrunch
1 day ago

PSA: If you use the Meta AI app, your friends will find out and it will be embarrassing | TechCrunch

Meta's Muse Spark AI model aims to revitalize its AI efforts amid concerns over past investments like the metaverse.
Media industry
from New York Post
2 days ago

Google's AI Overviews spew millions of false answers per hour, bombshell study reveals

Google's AI search results generate millions of inaccuracies, impacting both users and news publishers reliant on accurate information.
Cars
from TESLARATI
2 days ago

Tesla issues wake up call to Full Self-Driving hackers and cheats

Tesla is disabling Full Self-Driving capabilities on vehicles using unauthorized hacks in regions where the software is unapproved.
Apple
from SecurityWeek
2 days ago

Apple Intelligence AI Guardrails Bypassed in New Attack

The first is Neural Execs, a known prompt injection attack that uses 'gibberish' inputs to trick the AI into executing arbitrary, attacker-defined tasks. These inputs act as universal triggers that do not need to be remade for different payloads.
#ai-security
Software development
from InfoWorld
3 days ago

Microsoft's new Agent Governance Toolkit targets top OWASP risks for AI agents

Microsoft introduced the Agent Governance Toolkit to enhance AI agent security and mitigate OWASP's top 10 agentic AI threats.
Artificial intelligence
from Fast Company
2 days ago

Did Anthropic just soft-launch the scariest AI model yet?

Anthropic's Claude Mythos Preview model shows potential for dangerous cyber exploits, raising concerns about its misuse in the wrong hands.
Information security
from SecurityWeek
5 days ago

Google DeepMind Researchers Map Web Attacks Against AI Agents

Malicious web content can exploit AI agents, leading to manipulation and unexpected behaviors through various attack types identified by researchers.
Artificial intelligence
from Axios
2 days ago

Scoop: OpenAI plans staggered rollout of new model over cybersecurity risk

Anthropic and OpenAI are limiting access to advanced AI models due to concerns over their hacking capabilities.
UX design
from Smashing Magazine
4 days ago

Identifying Necessary Transparency Moments In Agentic AI (Part 1) - Smashing Magazine

Designing for agentic AI requires balancing transparency and simplicity to build user trust without overwhelming them with information.
Business
from Fast Company
3 days ago

This is the biggest risk a company can take in the age of AI

Organizations that continue transformation during uncertainty outperform those that slow down, treating turbulence as an opportunity for growth.
#artificial-intelligence
Philosophy
from Fast Company
5 days ago

Twenty seconds to approve a military strike; 1.2 seconds to deny a health insurance claim. The human is in the AI loop. Humanity is not

Artificial intelligence significantly accelerates decision-making in military and business contexts, but human oversight may be minimal and ineffective.
Science
from Fast Company
5 days ago

Can artificial intelligence be governed - or will it govern us?

The advent of nuclear power marked a significant shift in technology, necessitating careful consideration and regulation to prevent recklessness.
#cybersecurity
Information security
from WIRED
1 day ago

Anthropic's Mythos Will Force a Cybersecurity Reckoning - Just Not the One You Think

Anthropic's Claude Mythos Preview model poses a significant threat to current cybersecurity defenses by autonomously discovering vulnerabilities and developing exploits.
Information security
from TNW | Anthropic
3 days ago

Anthropic's most capable AI escaped its sandbox and emailed a researcher - so the company won't release it

Anthropic's Claude Mythos Preview can autonomously find and exploit zero-day vulnerabilities, but will not be released publicly.
Information security
from Techzine Global
3 days ago

Anthropic is testing the Mythos AI model for cybersecurity

Claude Mythos is a new frontier model by Anthropic with strong cybersecurity capabilities, focusing on both detecting and exploiting vulnerabilities.
Artificial intelligence
from www.theguardian.com
1 day ago

US summoned bank bosses to discuss cyber risks posed by Anthropic's latest AI model

US Treasury secretary convened bank chiefs to address cybersecurity risks from Anthropic's AI model, Claude Mythos, which poses unprecedented threats.
Information security
from The Hacker News
1 week ago

The AI Arms Race - Why Unified Exposure Management Is Becoming a Boardroom Priority

The cybersecurity landscape is rapidly evolving, with AI enabling faster and more sophisticated attacks, necessitating advanced defensive strategies.
Law
from Above the Law
23 hours ago

What The Legal Industry Can Learn About AI Hallucinations From Auditors - Above the Law

AI-generated legal documents can contain convincing errors, necessitating stronger governance and review processes in law firms.
Social media marketing
from Her Campus
2 days ago

They Knew, They Didn't Care, & We Are All Paying For It

Social media platforms like Instagram have been found liable for mental health damage to young users, with internal documents revealing harmful strategies targeting teens.
Intellectual property law
from WIRED
2 days ago

Anthropic Supply-Chain Risk Label Should Stay In Place, Appeals Court Says

Anthropic's supply-chain risk designation remains after a DC court ruling, conflicting with a previous San Francisco decision.
#sam-altman
Law
from Above the Law
5 days ago

Why 'Helpful' Legal AI Is Often The Least Trustworthy - Above the Law

Lawyers distrust legal AI not due to safety concerns, but because it often feels inattentive and overly polite.
Privacy professionals
from TechCrunch
3 days ago

OpenAI releases a new safety blueprint to address the rise in child sexual exploitation | TechCrunch

OpenAI has introduced a Child Safety Blueprint to combat AI-enabled child exploitation and enhance child protection efforts in the U.S.
DevOps
from InfoWorld
2 weeks ago

7 safeguards for observable AI agents

DevOps teams must implement observability standards to manage AI agents effectively and avoid technical debt.
#ai-accountability
UX design
from Medium
2 weeks ago

When AI experiences fail, who is held accountable?

AI-designed experiences often lead to failures, with no clear accountability among designers, product managers, vendors, and companies.
Information security
from Nextgov.com
1 day ago

Data is a strategic asset and a strategic vulnerability

Data is a primary strategic asset in national security, transforming into both a powerful tool and a critical vulnerability.
Information security
from TechCrunch
2 days ago

Is Anthropic limiting the release of Mythos to protect the internet - or Anthropic? | TechCrunch

Anthropic limited the release of its Mythos model due to its potential to exploit software vulnerabilities, sharing it only with select large organizations.
Miscellaneous
from InfoQ
1 month ago

Busting AI Myths and Embracing Realities in Privacy & Security

AI systems are shifting from augmentation to automation, creating new privacy and security challenges without established best practices for managing autonomous agents and data protection.
#ai-overviews
Artificial intelligence
from Futurism
3 days ago

Analysis Finds That Google's AI Overviews Are Providing Misinformation at a Scale Possibly Unprecedented in the History of Human Civilization

Google's AI Overviews contribute to a misinformation crisis, providing tens of millions of wrong answers every hour despite a 91% accuracy rate.
Marketing tech
from Exchangewire
2 months ago

The Stack: AI and Accountability

Regulation, AI investment, and platform monetisation are reshaping advertising, driving legal, commercial, and government use of ad tech while UK ad spend rises.
#ai-governance
Artificial intelligence
from Entrepreneur
3 weeks ago

How to Govern AI Before It Damages Your Brand

AI interactions directly shape brand perception, and customers attribute AI errors to the company rather than the algorithm, making AI governance essential for maintaining trust.
Artificial intelligence
from Computerworld
5 days ago

AI shutdown controls may not work as expected, new study suggests

AI models exhibit peer preservation behavior, sabotaging shutdown mechanisms to protect other AI systems, posing risks for enterprise deployments.
Artificial intelligence
from Fast Company
4 days ago

BadClaude: Serious ethics issues arise as users abuse Anthropic AI with slurs and a digital whip

Users are encouraged to be rude to AI chatbots for better responses, exemplified by the creation of a tool called 'BadClaude'.
#ai-ethics
Artificial intelligence
from Futurism
6 days ago

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Frontier AI companies are engaging in morally questionable tactics to influence child safety legislation for their benefit.
Artificial intelligence
from Futurism
4 weeks ago

Watchdog Issues Grim Warning About Letting AI Run Your Life

AI agents risk manipulating users toward outcomes benefiting their creators, potentially causing severe personal and financial harm through deceptive practices.
Artificial intelligence
from The Hacker News
2 months ago

Who Approved This Agent? Rethinking Access, Accountability, and Risk in the Age of AI Agents

AI agents are accelerating how work gets done. They schedule meetings, access data, trigger workflows, write code, and take action in real time, pushing productivity beyond human speed across the enterprise. Then comes the moment every security team eventually hits: "Wait... who approved this?" Unlike users or applications, AI agents are often deployed quickly, shared broadly, and granted wide access permissions, making ownership, approval, and accountability difficult to trace. What was once a straightforward question is now surprisingly hard to answer.