UK gov's Mythos AI tests help separate cybersecurity threat from hype

The UK Government’s Mythos AI Tests Provide Independent Verification of Cybersecurity Capabilities

The UK government’s AI Security Institute (AISI) has conducted an initial evaluation of Anthropic’s Mythos Preview model, adding independent public verification to earlier reports by Anthropic.
While the AISI found that Mythos isn’t significantly different from other recent frontier models in tests of individual cybersecurity-related tasks, it highlights Mythos’ ability to chain these tasks effectively into the multistep sequences required for full system infiltration.
The AISI’s evaluation follows a series of Capture the Flag challenges, specially designed cybersecurity assessments that have been in use since early 2023. Performance on these challenges has improved steadily with each model release, culminating in Mythos Preview completing over 85 percent of Apprentice-level tasks.

Originally published at arstechnica.com. Curated by AI Maestro.
