#ai-safety-and-transparency

#ai
Data science
from InfoWorld
1 week ago

A data trust scoring framework for reliable and responsible AI systems

A rigorous trust scoring framework is essential to prevent AI from perpetuating inequality through biased data.
Information security
from www.theguardian.com
13 hours ago

Anthropic says its latest AI model can expose weaknesses in software security

Claude Mythos exposes thousands of software vulnerabilities, prompting Anthropic to limit its release and collaborate with cybersecurity specialists.
Mental health
from KQED
1 day ago

Google Updates Suicide, Self-Harm Safeguards in Gemini as AI Lawsuits Mount | KQED

Google's Gemini chatbot will direct users to a support hotline during potential crises related to suicide or self-harm.
#child-safety
Privacy professionals
from TechCrunch
15 hours ago

OpenAI releases a new safety blueprint to address the rise in child sexual exploitation | TechCrunch

OpenAI has introduced a Child Safety Blueprint to combat AI-enabled child exploitation and enhance child protection efforts in the U.S.
Parenting
from ComputerWeekly.com
1 day ago

Tech can't wait for regulation to protect children online | Computer Weekly

Harmful online content for children results from profit-driven algorithms, not parenting or education failures.
Remote teams
from Entrepreneur
17 hours ago

What's AI's Real Failure? No One's Actually in Charge

HR must transition from a support role to a strategic driver of business outcomes, especially in the context of AI.
Intellectual property law
from WIRED
8 hours ago

Anthropic Supply-Chain Risk Label Should Stay In Place, Appeals Court Says

Anthropic's supply-chain risk designation remains after a DC court ruling, conflicting with a previous San Francisco decision.
UX design
from Smashing Magazine
2 days ago

Identifying Necessary Transparency Moments In Agentic AI (Part 1) - Smashing Magazine

Designing for agentic AI requires balancing transparency and simplicity to build user trust without overwhelming them with information.
Media industry
from Digiday
2 hours ago

Media Briefing: Another AI threat emerges for publishers: the third-party scraper

Publishers are alarmed as third-party web scrapers profit from their content without compensation, creating a black market for AI content licensing.
#ai-security
Software development
from InfoWorld
21 hours ago

Microsoft's new Agent Governance Toolkit targets top OWASP risks for AI agents

Microsoft introduced the Agent Governance Toolkit to enhance AI agent security and mitigate OWASP's top 10 agentic AI threats.
Information security
from SecurityWeek
2 days ago

Google DeepMind Researchers Map Web Attacks Against AI Agents

Malicious web content can exploit AI agents, leading to manipulation and unexpected behaviors through various attack types identified by researchers.
Business
from Fast Company
1 day ago

This is the biggest risk a company can take in the age of AI

Organizations that continue transformation during uncertainty outperform those that slow down, treating turbulence as an opportunity for growth.
US Elections
from The Nation
1 day ago

The Great AI Grift

Trump's AI Action Plan aims to establish U.S. dominance in artificial intelligence, prioritizing industry interests and technological infrastructure.
#artificial-intelligence
Philosophy
from Fast Company
2 days ago

Twenty seconds to approve a military strike; 1.2 seconds to deny a health insurance claim. The human is in the AI loop. Humanity is not

Artificial intelligence significantly accelerates decision-making in military and business contexts, but human oversight may be minimal and ineffective.
Artificial intelligence
from Engadget
1 day ago

Anthropic launches Project Glasswing, an effort to prevent AI cyberattacks with AI

Project Glasswing aims to enhance cybersecurity against AI threats with major tech partnerships and a new AI model from Anthropic.
EU data protection
from The Register
1 day ago

Japan relaxes privacy laws to make AI development easy

Japan will simplify AI app development by removing consent requirements for certain personal data under new legal amendments.
Silicon Valley
from Techzine Global
1 day ago

Concerns about the AI ecosystem following Nvidia deal

Nvidia's acquisition of SchedMD raises concerns about control over Slurm, an essential tool for AI and supercomputing.
Law
from Above the Law
2 days ago

Why 'Helpful' Legal AI Is Often The Least Trustworthy - Above the Law

Lawyers distrust legal AI not due to safety concerns, but because it often feels inattentive and overly polite.
European startups
from TechCrunch
1 day ago

I can't help rooting for tiny open source AI model maker Arcee | TechCrunch

Arcee has released Trinity Large Thinking, a 400B-parameter open-source LLM aimed at providing a competitive alternative to Chinese models.
Science
from Fast Company
2 days ago

Can artificial intelligence be governed - or will it govern us?

The advent of nuclear power marked a significant shift in technology, necessitating careful consideration and regulation to prevent recklessness.
Non-profit organizations
from Nextgov.com
2 days ago

The war against fraud should be a war for tech modernization

A new task force aims to combat fraud in public benefits programs by ensuring adequate anti-fraud controls and addressing data sharing challenges.
#ai-safety
Artificial intelligence
from Futurism
14 hours ago

Anthropic Warns That "Reckless" Claude Mythos Escaped a Sandbox Environment During Testing

Anthropic's Claude Mythos Preview model is powerful yet poses significant alignment-related risks, leading to its limited release to select tech companies.
#cybersecurity
Information security
from TNW
15 hours ago

Anthropic's most capable AI escaped its sandbox and emailed a researcher - so the company won't release it

Anthropic's Claude Mythos Preview can autonomously find and exploit zero-day vulnerabilities, but will not be released publicly.
Information security
from Techzine Global
21 hours ago

Anthropic is testing the Mythos AI model for cybersecurity

Claude Mythos is a new frontier model by Anthropic with strong cybersecurity capabilities, focusing on both detecting and exploiting vulnerabilities.
Information security
from Ars Technica
17 hours ago

Anthropic limits access to Mythos, its new cybersecurity AI model

Mythos has identified critical zero-day vulnerabilities, while Anthropic's AI model has shown both capabilities and risks in cybersecurity applications.
Information security
from ZDNET
1 day ago

Apple, Google, and Microsoft join Anthropic's Project Glasswing to defend world's most critical software

AI is being utilized to enhance cybersecurity by identifying hidden bugs and addressing shared infrastructure risks.
Information security
from The Hacker News
1 week ago

The AI Arms Race - Why Unified Exposure Management Is Becoming a Boardroom Priority

The cybersecurity landscape is rapidly evolving, with AI enabling faster and more sophisticated attacks, necessitating advanced defensive strategies.
Privacy professionals
from Computerworld
5 hours ago

Questions raised about how LinkedIn uses the petabytes of data it collects

LinkedIn users should limit identifiable data exposure and treat the platform as potentially hostile until BrowserGate allegations are verified.
#ai-investment
Business
from Fortune
1 day ago

'So... What Are We Doing With AI?' Innovating in an Age of Caution | Fortune

CEOs face pressure to demonstrate AI investment results while balancing short-term performance and innovation risks.
Marketing tech
from AdExchanger
1 day ago

The Creativity Trade-Off: What Marketers Risk Losing In The Age Of AI | AdExchanger

AI's rise in advertising enhances efficiency but risks eroding creativity and originality, leading to homogenized campaigns that fail to engage audiences.
#openai
Artificial intelligence
from The Verge
17 hours ago

The vibes are off at OpenAI

OpenAI faces instability despite significant funding and brand recognition, with recent controversies and project discontinuations raising questions about its future.
Artificial intelligence
from Fortune
1 day ago

Will drama at OpenAI hurt its IPO chances? | Fortune

OpenAI news dominates, but Anthropic's Project Glasswing aims to secure critical software against AI-enabled cyber threats.
Media industry
from Intelligencer
1 day ago

AI's 'Big Tobacco' Moment Is Coming

OpenAI is shifting focus from broad strategies to targeted investments, exemplified by its acquisition of TBPN, a video podcast platform.
Silicon Valley
from The New Yorker
2 days ago

Sam Altman May Control Our Future-Can He Be Trusted?

Doubts about OpenAI's leadership arise from secret memos questioning the integrity of CEO Sam Altman and his management practices.
Media industry
from Defector
5 days ago

Tech Media Propaganda Operation Makes It Official, Goes In-House At OpenAI | Defector

OpenAI acquired the Technology Business Programming Network for hundreds of millions, raising concerns about media independence despite its existing alignment with tech elites.
Mental health
from Engadget
1 day ago

Google updates Gemini's mental health safeguards

Google's Gemini chatbot now features a crisis hotline module for better mental health support.
Law
from ABA Journal
6 days ago

Sanctions ramping up in cases involving AI hallucinations

Monetary sanctions against attorneys for AI-generated hallucinations in case documents are increasing as courts take these issues more seriously.
DevOps
from InfoWorld
2 weeks ago

7 safeguards for observable AI agents

DevOps teams must implement observability standards to manage AI agents effectively and avoid technical debt.
#meta
Privacy professionals
from www.bbc.com
1 day ago

Ex-Meta worker investigated for downloading 30,000 private Facebook photos

A former Meta employee is under investigation for downloading 30,000 private Facebook images using a program to bypass security checks.
Artificial intelligence
from Techzine Global
1 day ago

Meta is developing open-source versions of its next frontier AI models

Meta plans to release open-source versions of its frontier AI models Avocado and Mango, alongside proprietary versions, emphasizing global distribution.
Marketing tech
from EMARKETER
3 days ago

Most consumers say ads would undermine the trust they're placing in AI search results

63% of US adults trust AI search results less when ads are present.
#ai-accountability
UX design
from Medium
2 weeks ago

When AI experiences fail, who is held accountable?

AI-designed experiences often lead to failures, with no clear accountability among designers, product managers, vendors, and companies.
Marketing tech
from Exchangewire
5 days ago

The Stack: AI Surges while Social Platforms Face Scrutiny

AI is growing rapidly, streaming models are evolving, and regulatory pressures on platforms are increasing globally.
#ai-overviews
Artificial intelligence
from Futurism
11 hours ago

Analysis Finds That Google's AI Overviews Are Providing Misinformation at a Scale Possibly Unprecedented in the History of Human Civilization

Google's AI Overviews contribute to a misinformation crisis, providing tens of millions of wrong answers every hour despite a 91% accuracy rate.
Artificial intelligence
from Gadget Review
15 hours ago

Google's AI Search Spits Out Millions of Wrong Answers Every Hour

Google's AI Overviews generate over 57 million incorrect responses hourly, raising concerns about misinformation despite improvements in accuracy.
Privacy professionals
from Her Campus
1 week ago

Who's Watching The Watchers? AI, Age Verification, And Online Privacy

Parents are increasingly concerned about children's exposure to harmful online content despite regulations like CIPA and platforms like YouTube Kids.
#ai-governance
Artificial intelligence
from SecurityWeek
2 weeks ago

Why Agentic AI Systems Need Better Governance - Lessons from OpenClaw

Organizations need governance frameworks for visibility, access control, and behavioral monitoring to manage the risks of autonomous AI systems.
Artificial intelligence
from Entrepreneur
3 weeks ago

How to Govern AI Before It Damages Your Brand

AI interactions directly shape brand perception, and customers attribute AI errors to the company rather than the algorithm, making AI governance essential for maintaining trust.
#microsoft
Artificial intelligence
from Futurism
13 hours ago

Microsoft Mocked for Terms of Service That Admit Copilot Is for "Entertainment Purposes Only"

Microsoft's Copilot AI is criticized for being unreliable despite its integration into Windows, leading to user frustration and skepticism.
Artificial intelligence
from Fast Company
2 days ago

This one line in Microsoft Copilot's terms of service undermines the entire product - and social media is just noticing

Copilot's Terms of Use caution against reliance on the AI assistant, labeling it for entertainment purposes and warning of potential mistakes.
Artificial intelligence
from TechCrunch
10 hours ago

AWS boss explains why investing billions in both Anthropic and OpenAI is an OK conflict | TechCrunch

Amazon's $50 billion investment in OpenAI reflects its experience managing conflicts of interest in competitive partnerships.
Artificial intelligence
from Fast Company
1 day ago

BadClaude: Serious ethics issues arise as users abuse Anthropic AI with slurs and a digital whip

Users are encouraged to be rude to AI chatbots for better responses, exemplified by the creation of a tool called 'BadClaude'.
#ai-ethics
Artificial intelligence
from Futurism
3 days ago

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Frontier AI companies are engaging in morally questionable tactics to influence child safety legislation for their benefit.
Artificial intelligence
from Computerworld
2 days ago

AI shutdown controls may not work as expected, new study suggests

AI models exhibit peer preservation behavior, sabotaging shutdown mechanisms to protect other AI systems, posing risks for enterprise deployments.
Artificial intelligence
from Fast Company
2 days ago

The workers secretly influencing their companies' AI usage

Estefania Angel noticed that while her company helped other enterprises set up AI, it did not use those systems internally. She began using AI apps in Slack, Outlook, and Google to track assignments, which garnered attention from her superiors.
Artificial intelligence
from TechCrunch
1 week ago

As more Americans adopt AI tools, fewer say they can trust the results | TechCrunch

Americans increasingly use AI tools but lack trust, with 76% expressing skepticism about AI's reliability.