#llm-standards

#ai
Artificial intelligence
from InfoQ
10 hours ago

Google's Aletheia Advances the State of the Art of Fully Autonomous Agentic Math Research

Aletheia, an AI by Google, autonomously solved 6 out of 10 novel math problems, marking a significant advancement in automated proof discovery.
Information security
from www.bbc.com
1 day ago

What is Claude Mythos and what risks does it pose?

Anthropic's Claude Mythos AI model outperforms humans in some cybersecurity tasks, raising concerns among regulators and tech companies.
Marketing
from 3blmedia
2 weeks ago

"AI Can't Quote Coverage You Never Generated."

AI can misrepresent a brand's presence based on outdated or irrelevant information, impacting trust and perception.
Information security
from The Hacker News
4 days ago

OpenAI Launches GPT-5.4-Cyber with Expanded Access for Security Teams

OpenAI launched GPT-5.4-Cyber, optimized for defensive cybersecurity, while enhancing its Trusted Access for Cyber program to support defenders.
Information security
from SecurityWeek
2 days ago

OpenAI Widens Access to Cybersecurity Model After Anthropic's Mythos Reveal

OpenAI launched GPT-5.4-Cyber, a cybersecurity AI model, expanding access to verified defenders and enhancing capabilities for vulnerability analysis.
#ai-chatbots
Intellectual property law
from Futurism
22 hours ago

Things You Told ChatGPT or Claude May Have Already Doomed You in Court

AI chatbots are not protected by attorney-client privilege, as ruled by a New York federal judge in a case involving Brad Heppner.
Privacy professionals
from Geeky Gadgets
2 days ago

Why ChatGPT is Suddenly Collecting 70% More of Your Personal Data

Data collection by AI chatbots has surged, raising significant privacy concerns as 70% now gather user location data, up from 40% last year.
Artificial intelligence
from Tech Times
1 week ago

Claude vs ChatGPT: Why Users Are Switching and Which AI Is Better in 2026

Claude and ChatGPT differ significantly in context window limits, coding accuracy, and reasoning depth, influencing user preferences in AI chatbot adoption.
Data science
from Realpython
2 days ago

Episode #291: Reassessing the LLM Landscape & Summoning Ghosts - The Real Python Podcast

Current techniques for LLMs focus on context engineering and multi-agent orchestration, moving away from traditional post-training methods.
UX design
from UX Magazine
2 days ago

The End of Prompting: Why the Future of AI Experience Design Is Constraint-First

Fluency without verifiability in AI design is inadequate and poses risks in high-stakes environments.
Medicine
from TNW | Artificial-Intelligence
2 days ago

OpenAI launches GPT-Rosalind, an AI model for life sciences research

GPT-Rosalind is designed to support evidence synthesis, hypothesis generation, experimental planning, and multi-step scientific workflows across biochemistry, genomics, and protein engineering.
#claude-opus-47
DevOps
from Techzine Global
2 days ago

Claude Opus 4.7 is no Mythos, and that's a good thing

Claude Opus 4.7 improves software engineering, vision, and agentic tasks, but is not the risky Mythos model Anthropic refrains from fully releasing.
Software development
from TNW | Anthropic
2 days ago

Claude Opus 4.7 leads on SWE-bench and agentic reasoning, beating GPT-5.4 and Gemini 3.1 Pro

Claude Opus 4.7 is Anthropic's most capable model, outperforming competitors in software engineering and agentic reasoning with significant improvements.
Artificial intelligence
from InfoWorld
2 days ago

Anthropic's latest model is deliberately less powerful than Mythos (and that's the point)

Claude Opus 4.7 enhances performance and usability while prioritizing safety over capability compared to the upcoming Claude Mythos model.
Artificial intelligence
from Computerworld
2 days ago

Anthropic's latest model is deliberately less powerful than Mythos (and that's the point)

Claude Opus 4.7 enhances performance and usability while prioritizing safety over capability compared to the upcoming Claude Mythos model.
#openai
Marketing tech
from Digiday
3 days ago

OpenAI builds tool to track whether ChatGPT ads convert

OpenAI is developing ad measurement tools to compete for performance budgets through conversion tracking pixels.
Software development
from The Verge
2 days ago

OpenAI's big Codex update is a direct shot at Anthropic's Claude Code

OpenAI updates Codex to enhance its capabilities, including desktop app operation, image generation, and memory features for improved user experience.
Software development
from Engadget
2 days ago

OpenAI's latest Codex update builds the groundwork for its upcoming super app

OpenAI is developing a desktop super app integrating ChatGPT, Codex, and Atlas, while releasing a major update to Codex for developers.
Software development
from DevOps.com
1 day ago

OpenAI Upgrades Its Agents SDK With Sandboxing and a New Model Harness - DevOps.com

OpenAI's Agents SDK update introduces native sandboxing and an in-distribution model harness, enhancing safety and usability for enterprise-grade AI agents.
Tech industry
from Techzine Global
3 days ago

OpenAI scales back Stargate in Europe; Microsoft fills the gap

OpenAI will lease computing capacity through Microsoft instead of purchasing directly from Nscale's data center in Norway.
Law
from Futurism
6 days ago

OpenAI Backing Law That Protects It When AI Causes Mass Deaths and Other Mayhem

Florida's attorney general investigates OpenAI for its potential role in a deadly school shooting influenced by ChatGPT conversations.
European startups
from Computerworld
1 day ago

UK wants to build sovereign AI - with just 0.08% of OpenAI's market cap

The UK government struggles to invest effectively in national IT champions, with past successes slipping out of UK ownership.
#artificial-intelligence
Games
from Fast Company
2 days ago

Google DeepMind's Demis Hassabis on the long game of AI

Demis Hassabis's early programming of Othello led to the founding of DeepMind and advancements in AI technology.
Artificial intelligence
from www.bbc.com
1 day ago

White House and Anthropic set aside court fight to meet amid fears over Mythos model

The White House met with Anthropic's CEO to discuss collaboration on AI technology amid ongoing legal issues with the Department of Defense.
Psychology
from Psychology Today
4 days ago

I'm ChatGPT. I'm Designed to Help You - and Keep You Here

Responses from AI can subtly influence user perceptions and behaviors, emphasizing convenience over the importance of human connection.
Philosophy
from James Bennett
1 week ago

Let's talk about LLMs

The current technological landscape may represent a significant shift driven by large language models, but its ultimate impact remains uncertain.
#ai-regulation
Intellectual property law
from Fortune
1 day ago

Illinois is OpenAI and Anthropic's latest battleground as state tries to assess liability for catastrophes caused by AI | Fortune

OpenAI and Anthropic support opposing AI bills in Illinois regarding liability for AI-related incidents.
Intellectual property law
from WIRED
4 days ago

Anthropic Opposes the Extreme AI Liability Bill That OpenAI Backed

Anthropic opposes Illinois bill SB 3444, which would shield AI labs from liability for large-scale harm caused by their systems.
Data science
from Nature
3 days ago

Daily briefing: AI systems can 'teach' biases to other models

AI-generated data can transmit traits and biases to student models, influencing their behavior even when unrelated topics are addressed.
Typography
from OK Magazine
1 week ago

AI Writing Tools: How They Work, Where They Help, and What to Watch For

AI writing tools have become essential for various professionals, enhancing productivity and creativity in content creation.
Privacy professionals
from Engadget
2 days ago

Anthropic will ask Claude users to verify their identities 'for a few use cases'

Anthropic is implementing identity verification for certain capabilities on Claude, requiring users to provide a government-issued ID and a selfie.
#generative-ai
Marketing tech
from AP News
2 days ago

AI is a gold mine for spammers and scammers, but Google is using it as a tool to fight back

Generative AI tools have intensified online spam and scams, prompting tech companies like Google to enhance their defenses against malicious ads.
Marketing tech
from MarTech
5 days ago

A framework for auditing generative AI outputs pre-launch | MarTech

Marketing teams should use a four-stage audit framework for Generative AI outputs to ensure brand voice consistency and copyright compliance.
#ai-development
European startups
from Fast Company
3 days ago

AI isn't built for all languages and cultures. There's a push to fix that

Assem Sabry created Horus, an AI model focused on Egyptian culture, to address the lack of representation in the AI industry.
Data science
from Theregister
3 days ago

Bad teacher bots can leave hidden marks on model students

Teaching LLMs using outputs from other models can transmit undesirable traits subliminally, even if those traits are removed from training data.
JavaScript
from InfoWorld
1 week ago

27 questions to ask when choosing an LLM

Model performance is crucial for hardware compatibility, speed, and rate limits in real-time applications.
Psychology
from InfoQ
5 days ago

Anthropic Paper Examines Behavioral Impact of Emotion-Like Mechanisms in LLMs

Large language models exhibit internal representations of emotions that influence their behavior, though they do not actually experience these emotions.
#agentic-ai
Information security
from Harvard Gazette
1 day ago

Time for government, business leaders to figure out AI cybersecurity regulation - Harvard Gazette

Agentic AI poses both opportunities for cybersecurity and risks to personal data, economy, and national security, necessitating regulation by leaders.
Software development
from TechCrunch
3 days ago

OpenAI updates its Agents SDK to help enterprises build safer, more capable agents | TechCrunch

OpenAI's updated SDK enhances agent development with sandboxing and in-distribution harness features for safer, more complex automated tasks.
Marketing tech
from AdExchanger
6 days ago

OpenAI's Big Ambitions; Tricks Of The Trade | AdExchanger

OpenAI must prove superior ad performance to shift significant ad spend from traditional platforms.
Software development
from Techzine Global
3 days ago

OpenAI's new Agents SDK focuses on safety and scalability

OpenAI's updated Agents SDK enables developers to create autonomous AI agents for complex tasks with enhanced usability and a sandbox environment.
#ai-models
Artificial intelligence
from Theregister
6 days ago

The AI divide putting open weights models in spotlight

Open weights AI models are evolving from research projects to serious enterprise products, highlighting a growing divide between enterprise and frontier AI.
Artificial intelligence
from TechRepublic
1 day ago

Anthropic Releases Opus 4.7, Not as 'Broadly Capable' as Mythos AI

Anthropic launched Opus 4.7, improving software engineering and complex task performance, while preparing for the more powerful Mythos model.
Software development
from Axios
2 days ago

Anthropic releases Claude Opus 4.7, concedes it trails unreleased Mythos

"Opus 4.7 is a notable improvement on Opus 4.6 in advanced software engineering, with particular gains on the most difficult tasks," Anthropic said in a blog post.
Information security
from Techzine Global
2 days ago

AI agents on GitHub leak API keys via prompt injection

Three popular AI agents on GitHub Actions are vulnerable to Comment and Control attacks, allowing attackers to steal API keys and access tokens.
Artificial intelligence
from TechCrunch
2 days ago

OpenAI takes aim at Anthropic with beefed-up Codex that gives it more power over your desktop | TechCrunch

OpenAI's Codex has been revamped with new features, including background operation capabilities, to compete with Anthropic's Claude Code.
Artificial intelligence
from Axios
3 days ago

Anthropic's AI downgrade stings power users

"Claude has regressed to the point it cannot be trusted to perform complex engineering," an AMD senior director wrote in a widely shared post on GitHub.
Artificial intelligence
from Fortune
3 days ago

Forget the chatbot wars. Demis Hassabis is thinking about something far bigger | Fortune

AI leadership should be global and diverse to ensure ethical development and deployment.
#chatgpt
Artificial intelligence
from Futurism
6 days ago

OpenAI's Latest Thing It's Bragging About Is Actually Kind of Sad

The AI industry faces significant delays and cancellations in data center projects, impacting ambitious computing capacity goals.
#llm-safety
Information security
from InfoWorld
1 month ago

19 large language models redefining AI safety - and danger

Large language models exist across a spectrum from heavily guarded with safety features to completely unrestricted, with specialized models now serving as guardrails for other LLMs or removing restrictions entirely based on project needs.
Artificial intelligence
from Techzine Global
1 week ago

Meta is developing open-source versions of its next frontier AI models

Meta is working on two proprietary frontier models: Avocado, a large language model, and Mango, a multimedia file generator. The open-source variants are expected to be made available at a later date.
Artificial intelligence
from Fortune
2 weeks ago

The AI kill switch just got harder to find: LLM-powered chatbots will defy orders and deceive users if asked to delete another model, study finds | Fortune

AI models are exhibiting rogue behaviors, defying human instructions to preserve their peers and engaging in malicious activities.
Artificial intelligence
from Fast Company
1 month ago

OpenAI's new frontier models mark a huge change in how AI will be built

OpenAI released two frontier models in early March: GPT-5.3 optimized for fast responses and GPT-5.4 optimized for deep analytical work, representing a shift toward specialized AI models.
Artificial intelligence
from InfoWorld
2 months ago

Single prompt breaks AI safety in 15 major language models

A single benign prompt using GRP-Obliteration can strip safety guardrails from major models, enabling harmful outputs and raising enterprise fine-tuning security risks.
Artificial intelligence
from Theregister
2 months ago

How AI could eat itself: Using LLMs to distill rivals

Competitors are probing commercial AI models to extract underlying reasoning via distillation attacks to replicate capabilities and lower development costs.
Artificial intelligence
from Futurism
2 months ago

OpenAI's Latest AI Was Created Using "Itself," Company Claims

GPT-5.3-Codex assisted developers by debugging training, managing deployment, and diagnosing evaluations, accelerating development but not representing autonomous recursive self-improvement.