Zach Anderson
Apr 10, 2026 23:18
Anthropic releases safety tips as Undertaking Glasswing reveals frontier AI fashions can now discover and exploit vulnerabilities sooner than human defenders.

Anthropic dropped a sobering evaluation this week: inside two years, AI fashions will uncover huge numbers of software program vulnerabilities which have sat unnoticed in code for years—and chain them into working exploits. The corporate’s safety groups launched detailed defensive suggestions alongside Undertaking Glasswing, their initiative to deploy Claude Mythos Preview’s capabilities for cyber protection.
The maths right here is not difficult. If attackers can use frontier fashions to automate vulnerability discovery and exploit era, the window between a patch dropping and a working exploit showing shrinks dramatically. Anthropic’s safety engineers have watched this occur in their very own testing.
What Their Analysis Really Discovered
In line with Anthropic’s technical findings, AI fashions excel at recognizing signatures of recognized vulnerabilities in unpatched techniques. Reversing a patch right into a working exploit—precisely the form of mechanical evaluation these fashions deal with effectively—used to require specialised expertise. Now it is turning into automated.
The corporate famous that publicly obtainable fashions under Mythos functionality ranges can already discover critical vulnerabilities that conventional code opinions missed for prolonged durations. Mozilla Firefox vulnerabilities found by means of AI scanning function one documented instance.
The Defensive Playbook
Anthropic’s suggestions prioritize controls that maintain even towards attackers with limitless endurance and AI help. Friction-based safety measures—additional pivot hops, charge limits, non-standard ports—lose effectiveness when adversaries can grind by means of tedious steps robotically.
Their high priorities:
Patch velocity issues greater than ever. Web-facing purposes ought to obtain patches inside 24 hours of an exploit turning into obtainable. The CISA Recognized Exploited Vulnerabilities catalog needs to be handled as an emergency queue. Anthropic recommends utilizing EPSS (Exploit Prediction Scoring System) for prioritizing all the things else.
Put together for 10x vulnerability report quantity. Over the following two years, consumption and triage processes will face strain they’ve by no means skilled. Organizations nonetheless working weekly spreadsheet conferences will not maintain tempo.
Scan your personal code with frontier fashions earlier than attackers do. This was Anthropic’s single most emphasised advice. Legacy code that predates present evaluate practices—particularly code whose authentic authors have moved on—represents the highest-value goal for proactive scanning.
Zero Belief Will get Actual
The steerage pushes onerous towards hardware-bound credentials and identity-based service isolation. A compromised construct server should not attain manufacturing databases. A compromised laptop computer should not contact construct infrastructure.
Static API keys, embedded credentials, and shared service-account passwords are described as “among the many first issues an attacker with model-assisted code evaluation will discover.”
For Smaller Operations
Organizations with out devoted safety groups received particular recommendation: allow computerized updates in all places, desire managed companies over self-hosting, use passkeys or {hardware} safety keys, and activate free safety tooling from code hosts like GitHub’s Dependabot and CodeQL.
Open-source maintainers ought to count on elevated vulnerability report quantity—some precious, some automated noise. Publishing a SECURITY.md with clear consumption processes helps separate sign from spam.
Anthropic dedicated to updating this steerage as Undertaking Glasswing progresses. For enterprises monitoring SOC 2 and ISO 27001 compliance, most suggestions map on to present controls. The distinction now’s urgency.
Picture supply: Shutterstock
