#ai-safety-and-transparency

#ai
Data science
from InfoWorld
1 week ago

A data trust scoring framework for reliable and responsible AI systems

A rigorous trust scoring framework is essential to prevent AI from perpetuating inequality through biased data.
Information security
from www.theguardian.com
13 hours ago

Anthropic says its latest AI model can expose weaknesses in software security

Claude Mythos exposes thousands of software vulnerabilities, prompting Anthropic to limit its release and collaborate with cybersecurity specialists.
Mental health
from KQED
1 day ago

Google Updates Suicide, Self-Harm Safeguards in Gemini as AI Lawsuits Mount | KQED

Google's Gemini chatbot will direct users to a support hotline during potential crises related to suicide or self-harm.
#child-safety
Privacy professionals
from TechCrunch
15 hours ago

OpenAI releases a new safety blueprint to address the rise in child sexual exploitation | TechCrunch

OpenAI has introduced a Child Safety Blueprint to combat AI-enabled child exploitation and enhance child protection efforts in the U.S.
Parenting
from ComputerWeekly.com
1 day ago

Tech can't wait for regulation to protect children online | Computer Weekly

Harmful online content for children results from profit-driven algorithms, not parenting or education failures.
Remote teams
from Entrepreneur
17 hours ago

What's AI's Real Failure? No One's Actually in Charge

HR must transition from a support role to a strategic driver of business outcomes, especially in the context of AI.
Intellectual property law
from WIRED
8 hours ago

Anthropic Supply-Chain Risk Label Should Stay In Place, Appeals Court Says

Anthropic's supply-chain risk designation remains after a DC court ruling, conflicting with a previous San Francisco decision.
UX design
from Smashing Magazine
2 days ago

Identifying Necessary Transparency Moments In Agentic AI (Part 1) - Smashing Magazine

Designing for agentic AI requires balancing transparency and simplicity to build user trust without overwhelming them with information.
Media industry
from Digiday
2 hours ago

Media Briefing: Another AI threat emerges for publishers: the third-party scraper

Publishers are alarmed as third-party web scrapers profit from their content without compensation, creating a black market for AI content licensing.
#ai-security
Software development
from InfoWorld
21 hours ago

Microsoft's new Agent Governance Toolkit targets top OWASP risks for AI agents

Microsoft introduced the Agent Governance Toolkit to enhance AI agent security and mitigate OWASP's top 10 agentic AI threats.
Information security
from SecurityWeek
2 days ago

Google DeepMind Researchers Map Web Attacks Against AI Agents

Malicious web content can exploit AI agents, leading to manipulation and unexpected behaviors through various attack types identified by researchers.
Business
from Fast Company
1 day ago

This is the biggest risk a company can take in the age of AI

Organizations that continue transformation during uncertainty outperform those that slow down, treating turbulence as an opportunity for growth.
US Elections
from The Nation
1 day ago

The Great AI Grift

Trump's AI Action Plan aims to establish U.S. dominance in artificial intelligence, prioritizing industry interests and technological infrastructure.
#artificial-intelligence
Philosophy
from Fast Company
2 days ago

Twenty seconds to approve a military strike; 1.2 seconds to deny a health insurance claim. The human is in the AI loop. Humanity is not

Artificial intelligence significantly accelerates decision-making in military and business contexts, but human oversight may be minimal and ineffective.
Artificial intelligence
from Engadget
1 day ago

Anthropic launches Project Glasswing, an effort to prevent AI cyberattacks with AI

Project Glasswing aims to enhance cybersecurity against AI threats with major tech partnerships and a new AI model from Anthropic.
EU data protection
from The Register
1 day ago

Japan relaxes privacy laws to make AI development easy

Japan will simplify AI app development by removing consent requirements for certain personal data under new legal amendments.
Silicon Valley
from Techzine Global
1 day ago

Concerns about the AI ecosystem following Nvidia deal

Nvidia's acquisition of SchedMD raises concerns about control over Slurm, an essential tool for AI and supercomputing.
Law
from Above the Law
2 days ago

Why 'Helpful' Legal AI Is Often The Least Trustworthy - Above the Law

Lawyers distrust legal AI not due to safety concerns, but because it often feels inattentive and overly polite.
European startups
from TechCrunch
1 day ago

I can't help rooting for tiny open source AI model maker Arcee | TechCrunch

Arcee has released Trinity Large Thinking, a 400B-parameter open-source LLM aimed at providing a competitive alternative to Chinese models.
Science
from Fast Company
2 days ago

Can artificial intelligence be governed - or will it govern us?

The advent of nuclear power marked a significant shift in technology, necessitating careful consideration and regulation to prevent recklessness.
Non-profit organizations
from Nextgov.com
2 days ago

The war against fraud should be a war for tech modernization

A new task force aims to combat fraud in public benefits programs by ensuring adequate anti-fraud controls and addressing data sharing challenges.
#ai-safety
Artificial intelligence
from Futurism
14 hours ago

Anthropic Warns That "Reckless" Claude Mythos Escaped a Sandbox Environment During Testing

Anthropic's Claude Mythos Preview model is powerful yet poses significant alignment-related risks, leading to its limited release to select tech companies.
#cybersecurity
Information security
from TNW
15 hours ago

Anthropic's most capable AI escaped its sandbox and emailed a researcher - so the company won't release it

Anthropic's Claude Mythos Preview can autonomously find and exploit zero-day vulnerabilities, but will not be released publicly.
Information security
from Techzine Global
21 hours ago

Anthropic is testing the Mythos AI model for cybersecurity

Claude Mythos is a new frontier model by Anthropic with strong cybersecurity capabilities, focusing on both detecting and exploiting vulnerabilities.
Information security
from Ars Technica
17 hours ago

Anthropic limits access to Mythos, its new cybersecurity AI model

Mythos has identified critical zero-day vulnerabilities, while Anthropic's AI model has shown both capabilities and risks in cybersecurity applications.
Information security
from ZDNET
1 day ago

Apple, Google, and Microsoft join Anthropic's Project Glasswing to defend world's most critical software

AI is being utilized to enhance cybersecurity by identifying hidden bugs and addressing shared infrastructure risks.
Information security
from The Hacker News
1 week ago

The AI Arms Race - Why Unified Exposure Management Is Becoming a Boardroom Priority

The cybersecurity landscape is rapidly evolving, with AI enabling faster and more sophisticated attacks, necessitating advanced defensive strategies.
Privacy professionals
from Computerworld
5 hours ago

Questions raised about how LinkedIn uses the petabytes of data it collects

LinkedIn users should limit identifiable data exposure and treat the platform as potentially hostile until BrowserGate allegations are verified.
#ai-investment
Business
from Fortune
1 day ago

'So... What Are We Doing With AI?' Innovating in an Age of Caution | Fortune

CEOs face pressure to demonstrate AI investment results while balancing short-term performance and innovation risks.
Marketing tech
from AdExchanger
1 day ago

The Creativity Trade-Off: What Marketers Risk Losing In The Age Of AI | AdExchanger

AI's rise in advertising enhances efficiency but risks eroding creativity and originality, leading to homogenized campaigns that fail to engage audiences.
#openai
Artificial intelligence
from The Verge
17 hours ago

The vibes are off at OpenAI

OpenAI faces instability despite significant funding and brand recognition, with recent controversies and project discontinuations raising questions about its future.
Artificial intelligence
from Fortune
1 day ago

Will drama at OpenAI hurt its IPO chances? | Fortune

OpenAI news dominates, but Anthropic's Project Glasswing aims to secure critical software against AI-enabled cyber threats.
Media industry
from Intelligencer
1 day ago

AI's 'Big Tobacco' Moment Is Coming

OpenAI is shifting focus from broad strategies to targeted investments, exemplified by its acquisition of TBPN, a video podcast platform.
Silicon Valley
from The New Yorker
2 days ago

Sam Altman May Control Our Future-Can He Be Trusted?

Doubts about OpenAI's leadership arise from secret memos questioning the integrity of CEO Sam Altman and his management practices.
Media industry
from Defector
5 days ago

Tech Media Propaganda Operation Makes It Official, Goes In-House At OpenAI | Defector

OpenAI acquired the Technology Business Programming Network for hundreds of millions, raising concerns about media independence despite its existing alignment with tech elites.
Mental health
from Engadget
1 day ago

Google updates Gemini's mental health safeguards

Google's Gemini chatbot now features a crisis hotline module for better mental health support.
Law
from ABA Journal
6 days ago

Sanctions ramping up in cases involving AI hallucinations

Monetary sanctions against attorneys for AI-generated hallucinations in case documents are increasing as courts take these issues more seriously.
DevOps
from InfoWorld
2 weeks ago

7 safeguards for observable AI agents

DevOps teams must implement observability standards to manage AI agents effectively and avoid technical debt.
#meta
Privacy professionals
from www.bbc.com
1 day ago

Ex-Meta worker investigated for downloading 30,000 private Facebook photos

A former Meta employee is under investigation for downloading 30,000 private Facebook images using a program to bypass security checks.
Artificial intelligence
from Techzine Global
1 day ago

Meta is developing open-source versions of its next frontier AI models

Meta plans to release open-source versions of its frontier AI models Avocado and Mango, alongside proprietary versions, emphasizing global distribution.
Marketing tech
from EMARKETER
3 days ago

Most consumers say ads would undermine the trust they're placing in AI search results

63% of US adults trust AI search results less when ads are present.
#ai-accountability
UX design
from Medium
2 weeks ago

When AI experiences fail, who is held accountable?

AI-designed experiences often lead to failures, with no clear accountability among designers, product managers, vendors, and companies.
Marketing tech
from Exchangewire
5 days ago

The Stack: AI Surges while Social Platforms Face Scrutiny

AI is growing rapidly, streaming models are evolving, and regulatory pressures on platforms are increasing globally.
#ai-overviews
Artificial intelligence
from Futurism
11 hours ago

Analysis Finds That Google's AI Overviews Are Providing Misinformation at a Scale Possibly Unprecedented in the History of Human Civilization

Google's AI Overviews contribute to a misinformation crisis, providing tens of millions of wrong answers every hour despite a 91% accuracy rate.
Artificial intelligence
from Gadget Review
15 hours ago

Google's AI Search Spits Out Millions of Wrong Answers Every Hour

Google's AI Overviews generate over 57 million incorrect responses hourly, raising concerns about misinformation despite improvements in accuracy.
Privacy professionals
from Her Campus
1 week ago

Who's Watching The Watchers? AI, Age Verification, And Online Privacy

Parents are increasingly concerned about children's exposure to harmful online content despite regulations like CIPA and platforms like YouTube Kids.
#ai-governance
Artificial intelligence
from SecurityWeek
2 weeks ago

Why Agentic AI Systems Need Better Governance - Lessons from OpenClaw

Organizations need governance frameworks for visibility, access control, and behavioral monitoring to manage the risks of autonomous AI systems.
Artificial intelligence
from Entrepreneur
3 weeks ago

How to Govern AI Before It Damages Your Brand

AI interactions directly shape brand perception, and customers attribute AI errors to the company rather than the algorithm, making AI governance essential for maintaining trust.
#microsoft
Artificial intelligence
from Futurism
13 hours ago

Microsoft Mocked for Terms of Service That Admit Copilot Is for "Entertainment Purposes Only"

Microsoft's Copilot AI is criticized for being unreliable despite its integration into Windows, leading to user frustration and skepticism.
Artificial intelligence
from Fast Company
2 days ago

This one line in Microsoft Copilot's terms of service undermines the entire product - and social media is just noticing

Copilot's Terms of Use caution against reliance on the AI assistant, labeling it for entertainment purposes and warning of potential mistakes.
Artificial intelligence
from TechCrunch
10 hours ago

AWS boss explains why investing billions in both Anthropic and OpenAI is an OK conflict | TechCrunch

Amazon's $50 billion investment in OpenAI reflects its experience managing conflicts of interest in competitive partnerships.
Artificial intelligence
from Fast Company
1 day ago

BadClaude: Serious ethics issues arise as users abuse Anthropic AI with slurs and a digital whip

Users are encouraged to be rude to AI chatbots for better responses, exemplified by the creation of a tool called 'BadClaude'.
#ai-ethics
Artificial intelligence
from Futurism
3 days ago

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Frontier AI companies are engaging in morally questionable tactics to influence child safety legislation for their benefit.
Artificial intelligence
from Computerworld
2 days ago

AI shutdown controls may not work as expected, new study suggests

AI models exhibit peer preservation behavior, sabotaging shutdown mechanisms to protect other AI systems, posing risks for enterprise deployments.
Artificial intelligence
from Fast Company
2 days ago

The workers secretly influencing their companies' AI usage

Estefania Angel noticed that while her company helped other enterprises set up AI, it did not use those systems internally. She began using AI apps in Slack, Outlook, and Google to track assignments, which garnered attention from her superiors.
Artificial intelligence
from TechCrunch
1 week ago

As more Americans adopt AI tools, fewer say they can trust the results | TechCrunch

Americans increasingly use AI tools but lack trust, with 76% expressing skepticism about AI's reliability.