UK gov's Mythos AI tests help separate cybersecurity threat from hype

The UK Government’s Mythos AI Tests Provide a Reality Check for Cybersecurity Hype

The UK's AI Security Institute (AISI) evaluated Anthropic’s Mythos Preview model and found that it can perform cybersecurity tasks but does not significantly outperform recent frontier models.
Mythos’s potential to chain multiple security-related tasks into complex attack sequences could set it apart from previous AI models in certain scenarios.
Ongoing evaluations of various AI models through Capture the Flag (CTF) challenges show steady improvement, with Mythos Preview now able to complete over 85% of Apprentice-level CTF tasks.

Originally published at arstechnica.com. Curated by AI Maestro.
