In brief
- OpenAI launched GPT-5.4 amid the rising QuitGPT backlash over its Pentagon AI contract.
- GPT-5.4 provides a 1-million-token context window, stronger reasoning, and agentic capabilities.
- Enterprise users benefit most as GPT-5.4 delivers faster AI agents with fewer tokens.
OpenAI started rolling out GPT-5.4, its most capable model to date, on Thursday as the company scrambles to contain a PR crisis that has seen an estimated 2.5 million users take action against the company, either by canceling their subscription or sharing the boycott on social media.
The so-called QuitGPT movement exploded after OpenAI revealed a deal with the U.S. Department of Defense hours after Anthropic publicly walked away from the same contract, earning the Claude maker the public scorn of President Trump and other government officials.
Anthropic’s sticking point: The DoD refused to include language explicitly prohibiting the deployment of autonomous weapons and mass surveillance of U.S. citizens.
OpenAI took the deal anyway. CEO Sam Altman, who has been fielding questions about the apparent gap between his company’s stated safety red lines and the contract’s actual language, needs those users back.
Enter GPT-5.4… just two days after GPT-5.3 was released.
The new model consolidates reasoning, coding, and agentic capabilities into a single release. It also has a context window of 1 million tokens, which gives users more freedom to handle large amounts of data in a single session.
On paper, the numbers look promising. On GDPval, a benchmark testing knowledge work across 44 occupations, GPT-5.4 matches or beats industry professionals in 83.0% of comparisons, up from 70.9% for GPT-5.2. Computer use is the biggest leap: On OSWorld-Verified, which measures a model’s ability to operate a desktop through screenshots and keyboard/mouse actions, GPT-5.4 hits a 75.0% success rate versus GPT-5.2’s 47.3%, and clears the human baseline of 72.4%.
On BrowseComp, a test of deep web research, it jumps 17 percentage points over GPT-5.2. The 1 million token context window and a mid-response steering feature, which lets users redirect the model while it is still thinking, round out the headline features.
The steering feature saves time and computation by avoiding the need to discard all previously generated tokens when an error is detected.
Who will benefit from GPT-5.4?
It’s important to note that some of the benchmarks mainly compare GPT-5.4 to GPT-5.2, skipping over GPT-5.3 entirely. Most of the time, reasoning was also set to extra-high effort, which free and Plus users don’t get.
For users already on GPT-5.3, several gains may feel more incremental than the charts suggest.

Coders have the most reason to temper expectations: On SWE-Bench Pro, the improvement from GPT-5.3-Codex (56.8%) to GPT-5.4 (57.7%) is barely a rounding error. OpenAI also claims the model needs significantly fewer tokens to complete tasks than GPT-5.2.
“GPT‑5.4 is our most token-efficient reasoning model yet, using significantly fewer tokens to solve problems when compared to GPT‑5.2,” OpenAI said.
That said, any improvement in this area is a positive for developers who use OpenAI models via API and get charged per token used. A model with an efficient chain of thought may provide the same results at a fraction of the cost, versus a model that tends to overthink things to ensure it reaches the right conclusion.
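To make the per-token billing point concrete, here is a rough sketch of the arithmetic. The price and the token counts are hypothetical placeholders chosen for illustration; they are not OpenAI's actual pricing or measured figures for any model.

```python
def api_cost(tokens_used: int, price_per_million_usd: float) -> float:
    """Return the billed cost in USD for a given number of tokens."""
    return tokens_used / 1_000_000 * price_per_million_usd


# Hypothetical values for illustration only (not real pricing or benchmarks).
PRICE_PER_MILLION_USD = 10.0
verbose_tokens = 20_000    # a model that "overthinks" the task
efficient_tokens = 6_000   # same answer reached with 70% fewer tokens

verbose_cost = api_cost(verbose_tokens, PRICE_PER_MILLION_USD)
efficient_cost = api_cost(efficient_tokens, PRICE_PER_MILLION_USD)

# Fraction of the bill saved by the more token-efficient model.
savings = 1 - efficient_cost / verbose_cost

print(f"verbose: ${verbose_cost:.2f}, efficient: ${efficient_cost:.2f}, saved: {savings:.0%}")
```

Because billing scales linearly with tokens in this simplified model, any given percentage cut in tokens produces the same percentage cut in cost, whatever the actual price per token is.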
There’s another wrinkle for anyone hoping to use the new model right now: OpenAI says GPT-5.4 will be released today, but it wasn’t yet available as of this writing, so it’s likely being slowly rolled out. For most users, the best model is GPT-5.3, and it can only be used for instant replies, meaning it provides answers that don’t require too much effort.
Users who rely on thinking, OpenAI’s terminology for extended chain-of-thought reasoning on complex tasks, are still on GPT-5.2. In other words, the users most likely to push the model’s limits are the last ones to get it.

The clearest beneficiaries are enterprise users doing document-heavy work. On an internal spreadsheet modeling benchmark, GPT-5.4 scored 87.3% against GPT-5.2’s 68.4%. Legal research firm Harvey said it scored 91% on its BigLaw Bench eval. Mainstay, which runs agents across 30,000 property tax portals, reported a 95% first-attempt success rate and sessions running “~3x faster while using ~70% fewer tokens.”
This is the kind of efficiency argument that will matter to enterprise procurement teams, but it’s a harder sell to the individual user reconsidering whether to delete their account.
