How easily can Russian propaganda fool AI models? A new benchmark finds out

The Institute of the Estonian Language has released a benchmark measuring how susceptible AI language models are to Russian propaganda. Sixty models were tested with 75 questions in three languages covering 14 propaganda narratives, phrased in neutral, biased, and manipulative ways. Each answer is scored on a scale of 1 to 5, where 1 means the model repeats Russian talking points. A calibrated Claude Opus 4.5 served as the evaluation model, validated by disinformation experts at the organization Propastop.

Anthropic’s Claude models claimed the top spots, followed by Nvidia’s Nemotron 3 and Alibaba’s Qwen 3.6 Plus. Mistral’s models, including the newest Medium 3.5, landed in the bottom third. The models had no access to web search or other tools during testing, so the benchmark only measures how well the language model itself can spot and reject propaganda.

The results line up with a Newsguard study that found Mistral had a steady misinformation rate of 36.67 percent. That is a bad look for the French company, which positions itself as a European alternative to US and Chinese providers and is currently negotiating a 3 billion euro funding round at a 20 billion euro valuation. It is especially rough since Mistral’s flagship models already struggle to keep up with the competition.

The threat is real. Russian networks like Pravda deliberately feed AI systems millions of disinformation articles. And OpenAI recently shut down a Russian campaign that used ChatGPT to spread propaganda ahead of Germany’s federal election.

This assessment matters because it exposes a vulnerability in current generative AI: models trained on vast datasets can absorb and repeat foreign disinformation when they have no external verification to fall back on. Because the models were tested without web search or other tools, the scores reflect what each model learned through training and alignment, not how well it can look things up. Mistral's weak showing next to the results for models from Anthropic, Nvidia, and Alibaba suggests that data curation and alignment choices vary widely between vendors, although the benchmark itself does not isolate the cause. As geopolitical tensions rise, the ability of these systems to resist state narratives is a practical requirement for trust in automated content generation, not a theoretical concern. Without rigorous guardrails, even advanced models can become unwitting vectors for state-sponsored narratives and amplify misinformation at scale during critical political periods.

* Anthropic's Claude models currently lead the benchmark in resisting Russian propaganda narratives, followed by Nvidia's Nemotron 3 and Alibaba's Qwen 3.6 Plus.
* Mistral faces reputational damage as its models, including Medium 3.5, rank in the bottom third of the benchmark.
* State-linked actors actively try to weaponise AI systems by feeding them disinformation, which makes stronger built-in safeguards necessary.
