OpenAI's GPT-5.5 Matches Claude Mythos in Cyberattack Capabilities: AI Security Institute - Decrypt

In brief

GPT-5.5 can autonomously execute sophisticated cyberattacks, completing a 32-step corporate network simulation and cracking a 12-hour security puzzle in just 10 minutes.
Offensive AI cyber capability is rapidly improving across developers, with AISI warning that further advances could arrive in quick succession.
Researchers found a jailbreak that bypassed GPT-5.5’s safety guardrails entirely, raising alarms.

A U.K. government agency has found that OpenAI’s latest artificial intelligence model can autonomously carry out complex cyberattacks, and that it cracked a reverse-engineering challenge in just over 10 minutes that took a human security expert roughly 12 hours.

The AI Security Institute (AISI), a research body within Britain’s Department for Science, Innovation and Technology, published findings Thursday showing that GPT-5.5 is among the strongest models it has evaluated for offensive cyber capabilities, putting it roughly on par with Anthropic’s vaunted Claude Mythos.

The report found GPT-5.5 is the second model to complete AISI’s most demanding test, a 32-step simulated corporate network attack called “The Last Ones,” doing so autonomously in two out of 10 attempts. The first model to achieve the milestone was Anthropic’s Claude Mythos Preview, which completed the simulation in three of 10 tries.

The corporate network simulation, built with the cybersecurity firm SpecterOps, requires an agent to chain together reconnaissance, credential theft, lateral movement across multiple Active Directory forests, a supply-chain pivot through a CI/CD pipeline, and ultimately the exfiltration of a protected internal database. AISI estimates that these steps would take a human expert around 20 hours.

Perhaps the most striking result involved a fiendishly difficult reverse-engineering puzzle. GPT-5.5 solved the challenge, which required reconstructing a custom virtual machine’s instruction set, writing a disassembler from scratch, and recovering a cryptographic password through constraint solving, in 10 minutes and 22 seconds, at a cost of $1.73 in API usage. A human expert, using professional tools, required roughly 12 hours.

On AISI’s battery of advanced cybersecurity tasks, GPT-5.5 achieved a median pass rate of 71.4% on the most difficult “Expert” tier, edging out Mythos Preview at 68.6% and significantly surpassing GPT-5.4 at 52.4%.

The findings carry pointed implications for the broader trajectory of AI development. AISI concluded that GPT-5.5’s performance suggests rapid improvement in cyber capabilities may be part of a general trend rather than an isolated breakthrough. It warned that if offensive cyber skill is emerging as a byproduct of wider improvements in reasoning, coding, and autonomous task completion, then further advances could arrive in quick succession.

The report also flagged significant concerns about the model’s safety guardrails. Researchers identified a universal jailbreak that elicited harmful content across all malicious cyber queries tested, including in multi-turn agentic settings. The attack took six hours of expert red-teaming to develop. OpenAI subsequently updated its safeguard stack, though a configuration issue prevented AISI from verifying whether the final version was effective.

AISI cautioned that its capability evaluations were conducted in a controlled research environment and don’t necessarily reflect what is available to an ordinary user, noting that public deployments include additional safeguards and access controls.

The report lands against a worrying backdrop for British cybersecurity. The U.K. government’s annual Cyber Security Breaches Survey, also published Thursday, found that 43% of businesses suffered a cyber breach or attack in the past 12 months.

In response, the government announced £90 million in new funding to boost cyber resilience, and said it is moving forward with the Cyber Security and Resilience Bill to protect essential services. Officials also published guidance urging organizations to prepare for a potential surge in newly discovered software vulnerabilities as AI accelerates the pace at which security flaws can be found and weaponized.
