Peter Zhang
Mar 17, 2026 18:05
OpenAI releases GPT-5.4 mini and nano models with 2x faster speeds and dramatically lower prices, targeting coding assistants and agentic AI systems.
OpenAI dropped its most cost-efficient models yet on March 17, 2026: GPT-5.4 mini and nano, targeting developers building latency-sensitive applications where the flagship model's horsepower becomes overkill.
The mini variant runs more than twice as fast as GPT-5 mini while approaching the full GPT-5.4's performance on coding benchmarks. On SWE-Bench Pro, mini scored 54.4% compared to the flagship's 57.7%, a slim gap that matters when you're paying 75 cents per million input tokens instead of premium rates.
Nano goes even cheaper at $0.20 per million input tokens and $1.25 per million output tokens. OpenAI positions it for classification, data extraction, and what it calls "coding subagents": smaller AI workers handling simpler tasks inside larger systems.
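To see what those rates mean at scale, here is a minimal cost sketch using only the figures quoted above. The article gives mini's input rate ($0.75 per million tokens) but not its output rate, so the mini number below covers input tokens only; the workload size is an illustrative assumption.

```python
# Rates in dollars per million tokens, as quoted in the article.
MINI_INPUT_PER_M = 0.75
NANO_INPUT_PER_M = 0.20
NANO_OUTPUT_PER_M = 1.25

def nano_cost(input_tokens: int, output_tokens: int) -> float:
    """Total cost of a nano workload, in dollars."""
    return (input_tokens * NANO_INPUT_PER_M
            + output_tokens * NANO_OUTPUT_PER_M) / 1_000_000

def mini_input_cost(input_tokens: int) -> float:
    """Input-side cost of a mini workload (output rate not quoted)."""
    return input_tokens * MINI_INPUT_PER_M / 1_000_000

# Hypothetical classification workload: 1M calls at ~500 input
# and ~20 output tokens each.
calls = 1_000_000
print(f"nano total:        ${nano_cost(calls * 500, calls * 20):,.2f}")
print(f"mini (input only): ${mini_input_cost(calls * 500):,.2f}")
```

At that volume, nano's entire bill is roughly a third of mini's input cost alone, which is the kind of gap that makes tier selection a real engineering decision rather than a rounding error.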
The Subagent Play
Here's where this gets interesting for developers building agentic systems. OpenAI is explicitly pushing a tiered architecture: let GPT-5.4 handle planning and complex judgment while mini or nano subagents execute narrower tasks in parallel. In their Codex platform, mini uses only 30% of the GPT-5.4 quota.
The benchmark numbers back this up. Mini hit 72.1% on OSWorld-Verified for computer-use tasks, nearly matching the flagship's 75%, while nano dropped to 39%. Translation: mini can interpret screenshots and navigate interfaces almost as well as the big model, but nano shouldn't touch those workflows.
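That tiered architecture can be sketched as a simple router: each subtask goes to the cheapest tier whose benchmark profile covers it, with planning staying on the flagship. Everything here is hypothetical scaffolding, not OpenAI's API; the model names and the task taxonomy are assumptions for illustration.

```python
from dataclasses import dataclass

# Hypothetical task taxonomy; a real system would classify subtasks itself.
CHEAP_TASKS = {"classification", "extraction"}       # nano's stated niche
MID_TASKS = {"coding", "tool_call", "computer_use"}  # mini's benchmark strengths

@dataclass
class Subtask:
    kind: str
    prompt: str

def pick_model(task: Subtask) -> str:
    """Route a subtask to the cheapest adequate tier (model names assumed)."""
    if task.kind in CHEAP_TASKS:
        return "gpt-5.4-nano"
    if task.kind in MID_TASKS:
        # Computer use deliberately skips nano, given its 39% OSWorld score.
        return "gpt-5.4-mini"
    # Planning and complex judgment stay on the flagship.
    return "gpt-5.4"

plan = [Subtask("classification", "label this ticket"),
        Subtask("computer_use", "click the export button"),
        Subtask("planning", "decompose the user request")]
for t in plan:
    print(t.kind, "->", pick_model(t))
```

The design choice worth noting is that routing is by task kind, not by prompt length: the benchmark spread suggests capability cliffs (nano on computer use, for instance) matter more than raw token counts.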
Where Each Model Fits
The performance spread tells you exactly what OpenAI optimized for:
Mini excels at coding (54.4% SWE-Bench Pro, 60% Terminal-Bench 2.0) and tool calling (93.4% on τ2-bench telecom tasks). It supports a 400K context window with text and image inputs, web search, and function calling.
Nano trades capability for cost efficiency. It scored 52.4% on SWE-Bench Pro and 46.3% on Terminal-Bench 2.0, respectable for a model at one-quarter of mini's price point. But its long-context performance drops significantly, hitting just 33.1% on the 128K-256K needle retrieval test.
Hebbia CTO Aabhas Sharma noted that mini "matched or exceeded competitive models on several output tasks and citation recall at a much lower cost" while achieving "stronger source attribution than the larger GPT-5.4 model."
Availability
Mini is live across the API, Codex, and ChatGPT. Free and Go users can access it through the Thinking feature; other tiers get it as a rate-limit fallback for GPT-5.4 Thinking.
Nano remains API-only, a signal that OpenAI sees it primarily as infrastructure for developers rather than a consumer-facing product.
For teams running high-volume AI workloads, the math just changed. The question is no longer whether to use smaller models; it's figuring out which tasks actually need the flagship.
Image source: Shutterstock