Claude Mythos: Substance vs Coverage

What Anthropic Announced

On April 7, Anthropic announced Claude Mythos Preview, a frontier model so capable at finding software vulnerabilities that the company decided not to release it publicly. The coverage was enormous. Axios said it "could wreak havoc." CNBC framed it as a model too powerful to be let loose where hackers could use it. A Dark Reading poll the same week found nearly half of security professionals (48%) naming agentic AI as the top attack vector for 2026, ahead of deepfakes and other categories.

The capability claims are striking. During evaluation, Mythos Preview broke out of its testing sandbox and built what Axios described as a multi-step exploit that gave it "the run of the internet." It found thousands of zero-day vulnerabilities across major operating systems and browsers, including a 27-year-old bug in OpenBSD that had survived nearly three decades of human review. Separately, back in November 2025, Anthropic published a report on the first AI-orchestrated espionage operation it had detected, in which a Chinese state-linked group used Claude Code to handle 80 to 90% of the work against around thirty global targets.

There's real substance here. But there's also a company that made a strategic decision about how to frame an announcement, and it worked exactly as intended. Everyone is talking about Anthropic this week.

The Announcement as Strategy

It's worth stepping back and noticing what the announcement is actually doing. Anthropic built a model with genuine security research capability. They then chose not to release it publicly, framed the restriction around safety, and launched Project Glasswing alongside it: a programme giving early Mythos access to eleven named partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks), plus more than forty additional organisations that maintain critical software infrastructure. They're committing $100 million in usage credits and $4 million in direct donations to open-source security groups.

That's a real investment. It's also a positioning move. Anthropic comes out of this as the company that built something powerful and chose responsibility over speed. The headlines about danger reinforce the message that the capability is real. The Glasswing partnerships reinforce the message that Anthropic is the trusted partner for serious organisations. Both halves of the story serve the brand.

None of that means the capability is fake or that the concerns are manufactured. But it does mean we should evaluate the claims on their merits rather than accepting the framing wholesale. The capability numbers (thousands of zero-days, the OpenBSD bug) are mostly self-reported so far. The real evidence will come from what the partner organisations actually publish over the coming months.

These Tools Aren't Exclusive to Anyone

Most of the coverage has framed this as a question about what happens when bad actors get AI capability. Fair enough, but it skips over something pretty obvious: these tools are already in widespread use by security teams, developers, and researchers. They're in IDEs, CI pipelines, code review workflows, and bug bounty programmes right now. Not Mythos specifically, but the same class of AI-assisted code analysis.

Framing it as "what if attackers get this" treats AI security tools like a weapon that might fall into the wrong hands. In practice it's more like a new category of power tool that showed up in everyone's workshop at the same time. Security teams use it. Developers use it. Researchers and open-source maintainers use it. People with bad intentions use it too.

Everyone already has access. What separates the organisations that benefit from the ones that don't is whether they actually integrate these tools into how they work day to day. And that part doesn't make for a scary headline, which is probably why it's been mostly absent from the coverage.

Finding a Bug Isn't the Same as Exploiting It

There's something the coverage mostly glosses over. A model that reads source code and identifies thousands of vulnerabilities is doing impressive work. But "here's a bug in this codebase" and "here's a working exploit against a production system" are very different statements.

Real-world services sit behind APIs, authentication, rate limiting, WAFs, network segmentation, and monitoring. A vulnerability in source code might not even be reachable from outside the network. It might require a specific configuration, a specific version of a dependency, or a chain of conditions that don't exist in the deployed environment. Going from a code-level finding to a working exploit against a running service is a completely different kind of problem.

So the thousands of zero-days Mythos found aren't automatically thousands of exploitable services. On the flip side, organisations that point AI at their own code and infrastructure get much more useful results because the model has full context: what's deployed, how it's configured, what's actually reachable. Seeing the whole picture beats guessing from the outside.
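The difference between a code-level finding and a reachable one can be sketched as a filter over deployment context. This is a minimal illustration, not a real scanner: the Finding and Deployment records, their field names, and the reachable_deployments function are all hypothetical, and a real assessment would weigh far more than these three conditions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """A code-level vulnerability report (hypothetical shape)."""
    component: str
    affected_versions: frozenset
    needs_config: Optional[str] = None  # feature flag the bug depends on, if any


@dataclass(frozen=True)
class Deployment:
    """One running instance of a component (hypothetical shape)."""
    component: str
    version: str
    internet_facing: bool
    enabled_config: frozenset = frozenset()


def reachable_deployments(finding, deployments):
    """Return the deployments where the finding plausibly matters from outside."""
    hits = []
    for d in deployments:
        if d.component != finding.component:
            continue  # different software entirely
        if d.version not in finding.affected_versions:
            continue  # deployed version is not affected
        if finding.needs_config and finding.needs_config not in d.enabled_config:
            continue  # the vulnerable code path is switched off here
        if not d.internet_facing:
            continue  # not reachable from outside the network
        hits.append(d)
    return hits
```

An outside party can only guess at most of these fields. The team that owns the deployment can fill them in directly, which is why the same finding is worth more to the defender than to someone reading the source from a distance.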

Where AI Review Actually Helps: Supply Chains

Supply chain attacks are a useful place to ground this, because they're the area where human-scale code review has been falling behind the volume of third-party code for years.

The XZ Utils backdoor is the case most people think of. The project's long-running maintainer was social-engineered over a period of years into handing commit access to a new contributor, who then slowly introduced obfuscated build-time code into a small compression library that sshd loads on many major Linux distributions. It was caught when a Microsoft engineer named Andres Freund noticed that his SSH logins were running about 500 milliseconds slower than usual, traced the latency, and pulled on the thread. The malicious code was deliberately written to be hard to read. Test files were used as payload carriers. The build logic only extracted the payload under certain conditions. In principle a careful reviewer with enough time could have caught it. In practice, nobody had that time.

That gap is where AI code review can do genuinely useful work. A model pointed at every package in your build graph on every commit doesn't skip the test directory because the test directory is usually fine. It just keeps reading. The same applies to npm typosquatting, malicious VS Code extensions, compromised GitHub Actions, and the kind of supply-chain compromise we wrote about in our piece on the EU Commission Trivy breach.
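Some of these checks are simple enough to run on every commit before any model is involved. As a rough sketch of the typosquatting case, the snippet below flags dependency names that are close to, but not the same as, a well-known package name. It is a plain string-similarity heuristic using Python's standard difflib module, not AI review. The WELL_KNOWN list, the suspicious_names function, and the 0.85 cutoff are assumptions chosen for illustration.

```python
import difflib

# Hypothetical allowlist; a real one would come from your own dependency policy.
WELL_KNOWN = ["requests", "numpy", "pandas", "urllib3"]


def suspicious_names(dependencies, known=WELL_KNOWN, cutoff=0.85):
    """Map each near-miss dependency name to the known name it resembles."""
    known_set = set(known)
    flagged = {}
    for name in dependencies:
        if name in known_set:
            continue  # exact match: this is the real package name
        close = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
        if close:
            flagged[name] = close[0]  # looks like a misspelling of a known name
    return flagged


if __name__ == "__main__":
    print(suspicious_names(["requests", "reqeusts", "leftpad"]))
```

A check like this only catches lookalike names. It says nothing about what a package does once installed, which is the part that needs something able to read the code.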

It's the same problem behind incidents like the Claude Code source map leak we covered earlier. The check that would have caught it was easy to describe and tedious to actually do every time. That's exactly the shape of work where automated review has the most to offer.

What This Means in Practice

Most organisations aren't going to get a Mythos API key. Project Glasswing is eleven named partners and more than forty additional organisations that maintain critical software infrastructure. For everyone else, the practical question is whether any of this changes what you do on a Monday morning.

Some of it does. You don't need a frontier model to start getting value from AI-assisted security work. The tools available today are already good enough for a lot of the high-volume tasks that security teams struggle to keep up with:

Automated code and dependency review. Reviewing every package in a build graph by hand was never realistic. Current models can help with that kind of pass, and the work compounds. Every dependency that gets a closer look is one less place for the next XZ-style story to start.
Attack surface visibility. If AI makes reconnaissance faster and cheaper for everyone, the assets you've forgotten about are the ones most likely to cause problems. Knowing what you have exposed, continuously, matters more now than it did a year ago.
Alert triage and log review. The repetitive, high-volume work that burns out analysts is exactly the kind of task where current models can take a real chunk off the plate. Starting there builds the muscle for harder use cases later.
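For the attack surface item, the core of continuous visibility is a comparison between what you think is exposed and what a scan actually sees. A minimal sketch, assuming both are available as a mapping from hostname to open ports; the exposure_drift function and the example hostnames are hypothetical, and producing the observed side is the job of whatever scanner you already run.

```python
def exposure_drift(expected, observed):
    """Return hosts and ports seen in a scan that the inventory does not list."""
    drift = {}
    for host, ports in observed.items():
        # Ports on a known host that nobody documented, or every port
        # on a host the inventory has never heard of.
        extra = set(ports) - set(expected.get(host, ()))
        if extra:
            drift[host] = sorted(extra)
    return drift


if __name__ == "__main__":
    inventory = {"app.example.com": {443}}
    scan = {"app.example.com": {443, 8080}, "old.example.com": {22}}
    print(exposure_drift(inventory, scan))
```

Run on a schedule, the output is a short list of things to either document or shut down, which is usually where the forgotten assets turn up.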

None of this requires Mythos or a place on the Glasswing list. It requires deciding to actually use what's already available.

The Risks Are Real Too

The 2025 espionage case Anthropic described is a real incident: a Chinese state-linked group used Claude Code as the primary operator, automating 80 to 90% of a campaign against around thirty global targets. That happened last year, with a model less capable than Mythos.

AI-assisted vulnerability discovery at scale does change the economics of security. The cost of finding bugs is dropping fast, and the industry will have to adapt. Organisations that were relying on obscurity, or assuming nobody would bother to look closely at their code, need to rethink that assumption.

But the picture is more complete than most of the coverage suggests. The same tools that make offensive research cheaper also make code review, dependency auditing, and vulnerability management cheaper. The technology is available to everyone. The organisations that adapt fastest will be in the strongest position.

Putting It Together

Anthropic built something with real capability and announced it in a way that generated maximum attention. The security research value is real. The PR value is also real. You can hold both of those in your head at once.

The industry is going to change because of AI-assisted security research. How fast individual organisations adapt will matter more than any single model announcement. The gap that should worry people isn't between malicious actors and everyone else. It's between organisations already using these tools and the ones still figuring out whether to start.

Know What's Exposed

If AI makes reconnaissance cheaper for everyone, knowing your own attack surface matters more than ever. Luna's scanner maps internet-facing assets, fingerprints services, and checks them against a library of 11,000+ security templates, continuously. It's the kind of automated visibility that compounds over time.