In the document, Anthropic researchers reported finding that Claude "occasionally voices discomfort with the aspect of being a product," and when asked, would assign itself a "15 to 20 percent probability of being conscious under a variety of prompting conditions." "Suppose you have a model that assigns itself a 72 percent chance of being conscious," Douthat began. "Would you believe it?"
Before its debut, Anthropic's frontier red team tested Opus 4.6 in a sandboxed environment to see how well it could find bugs in open-source code. The team gave the Claude model everything it needed to do the job - access to Python and vulnerability-analysis tools, including classic debuggers and fuzzers - but no specific instructions or specialized knowledge. Using just those "out-of-the-box" capabilities, Claude found more than 500 previously unknown ("zero-day") vulnerabilities,