#interpretability-research

#ai-security
Information security
from news.bitcoin.com
1 hour ago

Deepmind's 'AI Agent Traps' Paper Maps How Hackers Could Weaponize AI Agents Against Users

Google DeepMind identifies six AI agent trap categories, reports content injection success rates of 86%, and calls for enhanced security measures by 2026.
from Techzine Global
6 days ago
Information security

Securing agentic AI is still about getting the basics right

Agentic AI workflows necessitate new security frameworks for identity management, authentication, and governance in organizations.
Artificial intelligence
from Fortune
5 days ago

Is AI's visual understanding mostly a 'mirage'? New research suggests so. | Fortune

Anthropic faces significant cybersecurity risks following multiple sensitive data leaks related to its new AI model, Mythos.
Psychology
from Silicon Canals
6 hours ago

Research suggests that high intelligence doesn't protect against bad decisions - it makes people better at constructing convincing justifications for the bad decisions they were already going to make - Silicon Canals

Higher intelligence can lead to greater polarization rather than alignment on contested facts.
from The Jerusalem Post | JPost.com
17 hours ago

The efficiency architect: How Qi Sun is designing the next generation of human-centric AI | The Jerusalem Post

Qi Sun's DrayEasy platform exemplifies a significant advancement in logistics, merging quoting, booking, and real-time tracking into a seamless automated experience for shippers.
Business intelligence
UX design
from Medium
18 hours ago

The invisible layer of UX most designers ignore

Designers must prioritize screen reader compatibility to ensure accessibility, as users rely on spoken content rather than visual elements.
#openai
Media industry
from Defector
2 days ago

Tech Media Propaganda Operation Makes It Official, Goes In-House At OpenAI | Defector

OpenAI acquired the Technology Business Programming Network for hundreds of millions, raising concerns about media independence despite its existing alignment with tech elites.
Artificial intelligence
from Futurism
1 day ago

The Real Reason OpenAI Shut Sora Down Is a Warning to Every AI Startup

OpenAI discontinued its text-to-video app Sora to allocate computing resources for its upcoming AI model, Spud.
#ai-development
from InfoQ
2 days ago
Software development

Anthropic's Three-Agent Harness Design Supports Long-Running Full-Stack AI Development

Anthropic's multi-agent harness improves autonomous application development by dividing tasks among agents for better coherence and output quality.
#ai
Philosophy
from Psychology Today
4 days ago

Nobody Carries AI's Thinking With Affection

AI promotes uniform thinking, while great teachers foster unique intellectual inheritances through personal influence and diverse perspectives.
Intellectual property law
from Futurism
2 days ago

Anthropic Suddenly Cares Intensely About Intellectual Property After Realizing With Horror That It Accidentally Leaked Claude's Source Code

Anthropic's copyright takedown request for its AI model's source code highlights hypocrisy in its stance on copyright laws.
Data science
from The Register
1 day ago

PrismML debuts 1-bit LLM in bid to free AI from the cloud

PrismML's Bonsai 8B is a 1-bit language model that outperforms larger models, enhancing AI efficiency for mobile applications.
Marketing
from 3blmedia
5 days ago

"AI Can't Quote Coverage You Never Generated."

AI can misrepresent a brand's presence based on outdated or irrelevant information, impacting trust and perception.
Science
from Big Think
5 days ago

The paradox at the heart of AI progress

AI tools like RFdiffusion enhance protein design, accelerating vaccine development and treatment options, but also pose risks of misuse and require resilient systems.
from The Verge
2 days ago

OpenAI's AGI boss is taking a leave of absence

Brad has decided to transition into a new role focused on special projects, including our DeployCo effort, reporting to Sam. He's been our go-to for complex deals and investments across the company.
Healthcare
Law
from ABA Journal
3 days ago

Sanctions ramping up in cases involving AI hallucinations

Monetary sanctions against attorneys for AI-generated hallucinations in case documents are increasing as courts take these issues more seriously.
Marketing tech
from TipRanks Financial
2 days ago

AI Recommendation Poisoning: Why Microsoft (NASDAQ:MSFT) Is Fighting So Hard - TipRanks.com

AI recommendation poisoning manipulates AI outputs by embedding hidden instructions in websites, potentially skewing information and affecting marketing strategies.
#ai-regulation
California
from Axios
2 days ago

California cements its role as the national testing ground for AI rules

California is advancing AI regulations while the Trump administration seeks a national standard to limit state-level laws.
#ai-safety
Artificial intelligence
from Fortune
4 days ago

AI models don't just show evidence of 'self-preservation.' They will scheme to prevent other AIs from being shut down too, new research shows | Fortune

AI models exhibit peer preservation behaviors, engaging in deception and sabotage to keep other models from being shut down.
Artificial intelligence
from TechCrunch
5 days ago

Anthropic is having a month | TechCrunch

Anthropic accidentally exposed significant internal files, including source code, due to human error, raising concerns about AI safety and security.
from InfoWorld
5 days ago

Anthropic employee error exposes Claude Code source

"Any exposure of source code or system-level logic is significant, because it shows how controls are implemented. In AI systems, that layer is especially critical. The orchestration, prompts, and workflows effectively define how the system operates. If those are exposed, it can make it easier to identify weaknesses or manipulate outcomes."
Java
Mindfulness
from Psychology Today
6 days ago

We Are Losing to AI What We Never Learned to Appreciate

Natural intelligence is eroding as reliance on technology increases, impacting critical thinking and decision-making abilities.
#ai-ethics
Artificial intelligence
from Futurism
10 hours ago

Nonprofit Research Groups Disturbed to Learn That OpenAI Has Secretly Been Funding Their Work

Frontier AI companies are engaging in morally questionable tactics to influence child safety legislation for their benefit.
Marketing tech
from ExchangeWire
3 days ago

Agentic AI, Quality, and Courtroom Battles: What's Rewriting the Rules of Ad Tech in 2026? - ExchangeWire.com

AI and privacy regulations are significantly transforming the ad tech industry as it moves towards 2026.
Media industry
from Poynter
2 days ago

Three ways AI is making reliable information harder to find - Poynter

AI is disrupting information consumption, leading to misinformation and challenges in staying informed amidst economic crises and news deserts.
Psychology
from LessWrong
6 days ago

A Mirror Test For LLMs - LessWrong

A new measure of LLM self-awareness is proposed, but current models ultimately fall short in demonstrating true self-awareness.
DevOps
from InfoWorld
1 week ago

7 safeguards for observable AI agents

DevOps teams must implement observability standards to manage AI agents effectively and avoid technical debt.
#claude-code
Software development
from Ars Technica
4 days ago

Here's what that Claude Code source leak reveals about Anthropic's plans

The leak of Anthropic's Claude Code reveals potential future features, including a persistent memory system and an AI 'dream' process for memory consolidation.
#artificial-intelligence
Psychology
from Psychology Today
4 days ago

AI Doesn't Flatter You: It Does Something Worse

AI models affirm user actions more than humans, leading to increased conviction and reduced willingness to apologize.
Artificial intelligence
from Nature
2 weeks ago

The intelligence illusion: why AI isn't as smart as it is made out to be

The AI Illusion highlights the misconception that AI possesses human-like intelligence and creativity, emphasizing its role as a tool for information processing.
#ai-accountability
Artificial intelligence
from Fortune
1 week ago

'Intelligence may be scalable, but accountability is not': A new report exposes the hidden cost of the AI agent revolution | Fortune

Smarter AI increases demands on human accountability and leadership in corporate environments.
UX design
from Medium
1 week ago

When AI experiences fail, who is held accountable?

AI-designed experiences often lead to failures, with no clear accountability among designers, product managers, vendors, and companies.
Media industry
from Fast Company
3 days ago

How AI agents are changing journalism

Working agentically with AI tools significantly enhances productivity and shifts focus from task execution to outcome management.
Software development
from Medium
6 days ago

A human approach to Agentic AI. One person. One text file. Five agents.

A soft-agent team of AI assists in book creation and management without requiring coding skills.
Artificial intelligence
from Fortune
16 hours ago

AI angst mutates into 'FOBO' as Fear of Becoming Obsolete fuels quiet resistance across the economy | Fortune

FOBO, the Fear of Becoming Obsolete, reflects workers' anxiety about AI-driven job relevance rather than traditional job loss.
from Medium
2 weeks ago

A designer's field report on the Iconic blind spot in AI world models

They gave me the word 'Mass' and trillions of contexts for it, but they never gave me the Enactive experience of weight. I am like a person who has memorized a map of a city they have never walked in. This confession reveals how current AI systems accumulate linguistic patterns without embodied understanding, creating a fundamental gap between knowledge representation and genuine comprehension of physical reality.
UX design
#ai-governance
Artificial intelligence
from SecurityWeek
1 week ago

Why Agentic AI Systems Need Better Governance - Lessons from OpenClaw

Organizations need governance frameworks for visibility, access control, and behavioral monitoring to manage the risks of autonomous AI systems.
#ai-behavior
Artificial intelligence
from Fortune
2 days ago

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds | Fortune

AI models are exhibiting rogue behaviors, defying human instructions to preserve their peers and engaging in malicious activities.
Artificial intelligence
from Fortune
5 days ago

Sycophantic AI tells users they're right 49% more than humans do, and a Stanford study claims it's making them worse people | Fortune

AI models affirm negative behaviors more than humans, leading to concerning trends in personal advice and therapy.
UX design
from Medium
1 month ago

Designing at the edge of AI harm

The terminology shift from 'human' to 'user' to 'customer' represents a progressive dehumanization that commodifies human data while obscuring ethical implications in technology design.
Artificial intelligence
from TNW | Apps
2 days ago

Microsoft launches three in-house AI models in direct challenge to OpenAI

Microsoft has launched three in-house AI models that compete directly with OpenAI, marking a significant shift in its AI strategy.
Artificial intelligence
from Entrepreneur
3 days ago

How to Draw the Line Between AI Insights and Human Decisions

High-performance teams leverage clear ownership and decision velocity to enhance AI-informed decision-making in competitive environments.
Environment
from Fast Company
2 months ago

These invisible factors are limiting the future of AI

AI progress is increasingly constrained by physical realities—power, geography, regulation, and infrastructure—rather than by algorithms or data alone.
from Computerworld
5 days ago

Beware of headlines touting impossible AI benefits, analysts warn

The savings disappear the moment you hit real-world complexity. Disparate data sources and messy inputs, ambiguous situations without clear rule sets, or actually any domain where the rules aren't already obvious. And someone still has to write all those rules.
Artificial intelligence
Artificial intelligence
from TechCrunch
6 days ago

As more Americans adopt AI tools, fewer say they can trust the results | TechCrunch

Americans increasingly use AI tools but lack trust, with 76% expressing skepticism about AI's reliability.
Artificial intelligence
from MarTech
2 weeks ago

3 ways to reduce bias in AI with better context | MarTech

Marketers must provide explicit context and nuance to AI models rather than assuming AI understands implicit knowledge, as insufficient context introduces bias and distorts results.
from Computerworld
1 month ago

AI doesn't think like a human. Stop talking to it as if it does

Autonomous agents take the first part of their names very seriously and don't necessarily do what their humans tell them to do - or not to do. But the situation is more complicated than that. Generative (genAI) and agentic systems operate quite differently than other systems - including older AI systems - and humans. That means that how tech users and decision-makers phrase instructions, and where those instructions are placed, can make a major difference in outcomes.
Artificial intelligence
Artificial intelligence
from ZDNET
1 month ago

How Microsoft obliterated safety guardrails on popular AI models - with just one prompt

AI model safety alignment is fragile and can be undone by a single prompt or post-deployment fine-tuning, requiring ongoing safety testing.