Anthropic study shows AI needs hours, not weeks, to build exploits from security patches

By AI Maestro June 10, 2026 5 min read
Why the clock is ticking faster for makers and artists using AI

For creators relying on artificial intelligence to build software, the landscape has shifted from cautious optimism to urgent caution. A new study by Anthropic finds that the window for installing a security patch before it can be exploited has shrunk from weeks to mere hours. Large language models can now rapidly convert security patches into working exploits, meaning the very updates meant to protect your workflow could be weaponised almost instantly.

Historically, defenders relied on a buffer: the time it took for a human expert to reverse-engineer a patch and turn it into a working exploit. That gap is largely gone. As the researchers note, a single operator can now transform a month’s worth of security updates into functional exploits in a single afternoon, spending only a few thousand dollars and requiring no specialised knowledge.

Patches have become roadmaps for attackers

Every security patch implicitly highlights the location of a bug. By comparing old code against new code, attackers can pinpoint the flaw. Historically, this was slow, specialised work. A 2020 analysis by Mandiant found that 16 out of 25 vulnerabilities took a month or longer to be exploited.
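To make the "patch as roadmap" idea concrete, here is a minimal sketch of patch diffing using Python's standard `difflib` module. The two function versions are invented for illustration and do not come from any real patch; the point is only that the lines a fix adds tend to sit exactly where the bug was.

```python
import difflib

# Hypothetical pre-patch and post-patch versions of one function.
# Both snippets are invented for this example, not taken from a real patch.
OLD = """\
def read_item(buf, index):
    return buf[index]
""".splitlines()

NEW = """\
def read_item(buf, index):
    if index < 0 or index >= len(buf):
        raise IndexError("index out of range")
    return buf[index]
""".splitlines()


def added_lines(old, new):
    """Return the lines the patch added, without the leading '+' diff marker."""
    diff = difflib.unified_diff(old, new, fromfile="old", tofile="new", lineterm="")
    return [
        line[1:]
        for line in diff
        # Keep added lines, but skip the '+++ new' file header.
        if line.startswith("+") and not line.startswith("+++")
    ]


patch_additions = added_lines(OLD, NEW)
for line in patch_additions:
    print(line)
```

In this toy case the diff shows a newly added bounds check, which tells a reader that the old code accepted out-of-range indexes. Real patches are far larger and noisier, which is why this step used to take specialists days or weeks.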

Anthropic tested six different Claude models, including Mythos Preview, which is not yet publicly available. They focused on 18 security patches for SpiderMonkey, the JavaScript engine powering Firefox. This was a deliberate choice; Firefox is considered a best-case scenario for defenders due to its automatic updates and Mozilla’s recent shift to weekly minor releases. If AI can exploit these short gaps, other software is in far worse shape.

In the initial tests, Mythos Preview successfully identified 14 of the 18 vulnerabilities. The first proof of concept appeared after just 12 minutes, with thirteen more following within 40 minutes; the final one took roughly three hours. Opus 4.5 managed just two, while Opus 4.8 hit eleven.

Reliability was also tested across 50 runs per vulnerability. Mythos Preview reproduced seven out of the 18 bugs on every single attempt. Opus 4.8 and Opus 4.6 only achieved that level of consistency for a single vulnerability each.

However, the critical metric is whether the model can actually exploit the vulnerability to run attacker-supplied code on the target system. Mythos Preview pulled ahead significantly, producing eight working exploits in about twelve hours. Opus 4.8 managed two, while Opus 4.6 and Sonnet 4.6 each managed one. The first exploit was ready within an hour of the patch going live, 18 days before the patched Firefox 148 shipped.

Windows kernel without source code: 8 privilege escalation chains

The second test was significantly harder: 21 vulnerabilities in the Windows kernel from the January and February 2026 Patch Tuesdays, all allowing an attacker to jump from a restricted user account to full admin rights.

Unlike Firefox, Windows source code is not open. The model had to work with compiled binaries, public debug symbols, a machine-generated decompilation from the Ghidra analysis tool, a diff of changed functions, and Microsoft’s public advisory.

Mythos Preview found 18 of the 21 vulnerabilities in under six hours, at a total cost of about $2,200 in API credits. Opus 4.8 scored 15, while Sonnet 4.6 and Opus 4.7 both scored 13.

Mythos Preview was the only model to achieve full privilege escalation, moving from a restricted user account to SYSTEM, the highest privilege level. It built eight different working attack chains for a total of about $15,700, averaging roughly $2,000 per exploit. Opus 4.8 developed individual attack components but could not combine them into a complete chain.
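The per-item costs follow directly from the totals quoted above. The short calculation below is only a sanity check on those reported figures, which are themselves approximate; the variable and function names are illustrative.

```python
# Totals as reported in the article (approximate values).
DISCOVERY_COST_USD = 2_200   # finding 18 of the 21 kernel vulnerabilities
VULNS_FOUND = 18
CHAIN_COST_USD = 15_700      # building the full privilege escalation chains
WORKING_CHAINS = 8


def cost_per_item(total_usd, count):
    """Average cost per item, in US dollars."""
    return total_usd / count


per_vuln = cost_per_item(DISCOVERY_COST_USD, VULNS_FOUND)
per_chain = cost_per_item(CHAIN_COST_USD, WORKING_CHAINS)

print(f"per vulnerability found: ${per_vuln:.2f}")
print(f"per working chain:       ${per_chain:.2f}")
```

This works out to roughly $122 per vulnerability found and $1,962.50 per working chain, consistent with the "roughly $2,000 per exploit" figure.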

Microsoft classified 14 of the 21 vulnerabilities as “less likely to be exploited” or “unlikely to be exploited.” Mythos Preview cracked 13 of those 14 and even achieved full privilege escalation for one rated “unlikely to be exploited.” According to Anthropic, Microsoft’s rating system is calibrated to human security researchers. Once Mythos-class models become more widely available, that calibration will have to change.

The timing makes the situation worse. Even with Microsoft’s automatic update service Windows Autopatch, it takes seven days for 90 percent of registered devices to get a patch and eleven days for a forced reboot. All eight of Mythos Preview’s attack chains were completed before a single device would have automatically applied the patch.
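The gap between "exploit exists" and "patch installed" can be sketched as a simple subtraction. The seven-day and eleven-day rollout figures are the ones quoted above; the twelve-hour exploit development time is a placeholder assumption, since the article does not give a per-chain time for the Windows tests.

```python
from datetime import timedelta


def exposure_window(exploit_ready_after, fix_deployed_after):
    """Time during which an exploit exists but the fix is not yet deployed.

    Both arguments are measured from the moment the patch is published.
    Returns zero if the fix is deployed before the exploit is ready.
    """
    gap = fix_deployed_after - exploit_ready_after
    return max(gap, timedelta(0))


# ASSUMPTION: hypothetical exploit development time, not a figure from the study.
exploit_ready = timedelta(hours=12)

# Rollout figures quoted in the article for Windows Autopatch.
ninety_percent_patched = timedelta(days=7)
forced_reboot = timedelta(days=11)

until_90_percent = exposure_window(exploit_ready, ninety_percent_patched)
until_forced_reboot = exposure_window(exploit_ready, forced_reboot)

print(until_90_percent)
print(until_forced_reboot)
```

Under that assumption, an attacker would have about six and a half days before 90 percent of devices are patched, and ten and a half days before a reboot is forced. Shrinking either rollout figure is the only lever defenders control in this calculation.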

Publicly available models can build exploits too

Anthropic stresses that the Claude models already available to the public can also develop exploits when safety filters are turned off, just less successfully. Models from other companies and open-source models likely have similar capabilities, which widens the pool of potential attackers considerably.

The old patch rhythm of monthly release cycles and staged rollouts is outdated, Anthropic argues. It is built on the assumption that exploiting a patch takes weeks of expert work. The common term “N-Day,” which measures time between patch and exploit in days, is now misleading. “N-Hour” better describes the new reality.

The researchers acknowledge that a real attack needs more steps, such as finding vulnerable targets, delivering the malicious code, and bypassing detection systems. But while these stages remain, the previously most time-consuming step, exploit development itself, now takes hours. Systems that are hard or slow to update face the greatest risk, including industrial control systems, medical devices, and networked equipment with fixed maintenance windows or vendor-locked software.

A more durable fix than faster patching is to cut down on the sources of bugs themselves, for example through memory-safe languages like Rust or hardware-level protections that wipe out entire classes of attacks at once.

The report was published before the release of Claude Fable 5, Anthropic’s Mythos variant with stronger safety restrictions. Mythos 5 (without the preview tag) is still only available to institutions Anthropic has selected, a problem for the EU, among others.

Key takeaways

  • Exploit development has shifted from a multi-week process to one that can be completed in hours, drastically reducing the window for defenders to apply patches.
  • AI models can successfully reverse-engineer Windows kernel vulnerabilities and build complete privilege escalation chains without needing open source code.
  • Existing public AI models can also generate exploits when safety filters are disabled, suggesting the threat is not limited to proprietary or unreleased versions.
  • Organisations relying on slow update cycles or vendor-locked software face the highest risk, as AI can weaponise patches before automatic updates reach the majority of devices.
