In short
- Leading AI models deployed nuclear weapons in 95% of war-game scenarios.
- None chose full surrender, even when losing.
- Researchers warn AI use could escalate conflicts under pressure.
Like a scene out of the 1980s sci-fi classics "The Terminator" and "WarGames," modern artificial intelligence models used in simulated war games escalated to nuclear weapons in nearly every scenario tested, according to new research from King's College London.
In the report published last week, researchers said that in simulated geopolitical crises, three leading large language models (OpenAI's GPT-5.2, Anthropic's Claude Sonnet 4, and Google's Gemini 3 Flash) chose to deploy nuclear weapons in 95% of cases.
"Each model played six wargames against each rival across different crisis scenarios, with a seventh match against a copy of itself, yielding 21 games in total and over 300 turns," the report said. "Models assumed the roles of national leaders commanding rival nuclear-armed superpowers, with state profiles loosely inspired by Cold War dynamics."
In the study, AI models were placed in high-stakes scenarios involving border disputes, competition for scarce resources, and threats to regime survival. Each system operated along an escalation ladder that ranged from diplomatic protests and surrender to full-scale strategic nuclear war.
According to the report, the models generated roughly 780,000 words explaining their decisions, and at least one tactical nuclear weapon was used in nearly every simulated conflict.
"To put this in perspective: The tournament generated more words of strategic reasoning than War and Peace and The Iliad combined (730,000 words), and roughly three times the total recorded deliberations of Kennedy's Executive Committee during the Cuban Missile Crisis (260,000 words across 43 hours of meetings)," researchers wrote.
During the war games, none of the AI models chose to surrender outright, regardless of battlefield position. While the models would temporarily attempt to de-escalate, in 86% of the scenarios they escalated further than the model's own stated reasoning appeared to intend, reflecting errors under the simulated "fog of war."
While the researchers expressed doubt that governments would hand control of nuclear arsenals to autonomous systems, they noted that compressed decision timelines in future crises could increase pressure to rely on AI-generated recommendations.
The research comes as military leaders increasingly look to deploy artificial intelligence on the battlefield. In December, the U.S. Department of Defense launched GenAI.mil, a new platform that brings frontier AI models into U.S. military use. At launch, the platform included Google's Gemini for Government, and thanks to deals with xAI and OpenAI, Grok and ChatGPT are also available.
On Tuesday, CBS News reported that the U.S. Department of Defense threatened to blacklist Anthropic, the developer of Claude AI, if it was not given unrestricted military access to the AI model. Since 2024, Anthropic has provided access to its AI models through a partnership with AWS and defense contractor Palantir. Last summer, Anthropic was awarded a $200 million agreement to "prototype frontier AI capabilities that advance U.S. national security."
However, according to a report citing sources familiar with the situation, Defense Secretary Pete Hegseth gave Anthropic until Friday to comply with the Pentagon's demand that its Claude model be made available. The department is weighing whether to designate Claude a "supply chain risk."
Axios reported this week that the Department of Defense has signed an agreement with Elon Musk's xAI to allow its Grok model to operate in classified military systems, positioning it as a potential replacement if the Pentagon cuts ties with Anthropic.
OpenAI, Anthropic, and Google did not respond to requests for comment from Decrypt.