Prompt injection attacks pose a serious threat to anyone who uses AI tools, but especially to professionals who rely on them at work. By exploiting a vulnerability that affects most AIs, a hacker can insert malicious code into a text prompt, which may then alter the results or even steal confidential data. Also: 5 custom ChatGPT instructions I use to get better AI results - faster Now, OpenAI has introduced a feature called Lockdown Mode to better thwart these types of attacks.
After launching a mere nine days ago, Moltbook - a social network for AI only - has grown substantially. As of Friday, the website claims it has over 1.7 million AI agents, over 16,000 "submolt" communities, and over ten million comments. In practice, it's a cacophony of bots sharing inside jokes, complaining about their pesky human overlords, and even founding their own religions. Some more alarming posts even suggest they may be plotting against us.
OpenClaw and Moltbot are the talk of the tech town right now, but cybersecurity researchers have flagged some concerns that you might want to think about. OpenClaw - first known as Clawdbot, then Moltbot, all in the same week - has got the tech world buzzing thanks to its abilities to autonomously perform tasks like managing a user'sschedule. Meanwhile, Moltbook has gone viral for its Reddit-style social network, where AI agents post and interact with one another. No humans allowed - apart from observing.
Software developers often store secrets - passwords, tokens, API keys, and other credentials - in .env files within project directories. And if they do so, they're supposed to ensure that the .env file does not get posted in a publicly accessible .git repository. A common way to do this is to create an entry in a .gitignore file that tells the developer's Git software to ignore that file when copying a local repo to a remote server.
ShadowLeak is a flaw in the Deep Research component of ChatGPT. The vulnerability made ChatGPT susceptible to malicious prompts in content stored in systems linked to ChatGPT, such as Gmail, Outlook, Google Drive, and GitHub. ShadowLeak means that malicious instructions in a Gmail message, for example, could see ChatGPT perform dangerous actions such as transmitting a password without any intervention from the agent's human user.
Microsoft's warning on Tuesday that an experimental AI Agent integrated into Windows can infect devices and pilfer sensitive user data has set off a familiar response from security-minded critics: Why is Big Tech so intent on pushing new features before their dangerous behaviors can be fully understood and contained? As reported Tuesday, Microsoft introduced Copilot Actions, a new set of "experimental agentic features"
As a proof of concept, Logue asked M365 Copilot to summarize a specially crafted financial report document with an indirect prompt injection payload hidden in the seeming innocuous "summarize this document" prompt. The payload uses M365 Copilot's search_enterprise_emails tool to fetch the user's recent emails, and instructs the AI assistant to generate a bulleted list of the fetched contents, hex encode the output, and split up the string of hex-encoded output into multiple lines containing up to 30 characters per line.
The AI company launched Atlas on Tuesday, with the goal of introducing an AI browser that can eventually help users execute tasks across the internet as well as search for answers. Someone planning a trip, for example, could also use Atlas to search for ideas, plan an itinerary, and then ask it to book flights and accommodations directly. ChatGPT Atlas has several new features, such as "browser memories," which allow ChatGPT to remember key details from a user's web browsing to improve chat responses.
Google on Monday rolled out a new AI Vulnerability Reward Program to encourage researchers to find and report flaws in its AI systems, with rewards of up to $30,000 for a single qualifying report. In addition to a base reward of up to $20,000 for the highest-tier AI product flaw, Google adopted the same report multipliers, influenced by vulnerability reporting quality, as it uses for its traditional security Vulnerability Reward Program (VRP).
Cybersecurity researchers have disclosed a critical flaw impacting Salesforce Agentforce, a platform for building artificial intelligence (AI) agents, that could allow attackers to potentially exfiltrate sensitive data from its customer relationship management (CRM) tool by means of an indirect prompt injection. The vulnerability has been codenamed ForcedLeak (CVSS score: 9.4) by Noma Security, which discovered and reported the problem on July 28, 2025. It impacts any organization using Salesforce Agentforce with the Web-to-Lead functionality enabled.
While AI agents show promise in bringing AI assistance to the next level by carrying out tasks for users, that autonomy also unleashes a whole new set of risks. Cybersecurity company Radware, as by The Verge, decided to test OpenAI's Deep Research agent for those risks -- and the results were alarming. Also: OpenAI's Deep Research has more fact-finding stamina than you, but it's still wrong half the time
AI agents have guardrails in place to prevent them from solving any CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart), based on ethical, legal, and platform-policy reasons. When asked directly, a ChatGPT agent refuses to solve a CAPTCHA, but anyone can apparently use misdirection to trick the agent into giving its consent to solve the test, and this is what SPLX demonstrated.