UK gov's Mythos AI tests help separate cybersecurity threat from hype

The UK Government’s Mythos AI Tests Add Context to Cyber Threat Landscape

The UK government’s initial evaluation of Anthropic’s Mythos Preview model found that, while the model is capable in certain cybersecurity tasks, it does not significantly surpass other recent frontier models.
However, Mythos’s ability to chain multiple cybersecurity tasks into a multistep attack could make it more threatening than previous models. The evaluation provides independent public verification of Anthropic’s claims about the model’s capabilities.
The UK AI Security Institute’s ongoing assessments, which use Capture the Flag challenges, show significant improvement in AI performance on cybersecurity tasks: Mythos Preview can now complete over 85 percent of low-level tasks.

Originally published at arstechnica.com. Curated by AI Maestro.
