    Claude Opus 4.8 Review: Better At What It’s Good At, Worse At What It’s Not – Decrypt

    By Crypto Editor | June 7, 2026 | 9 Mins Read

    In brief

    • Opus 4.8 posted a clear win in math and produced the cleanest one-prompt game we've ever tested.
    • A single coding prompt drained our entire Pro token quota, making the model impractical for large projects without a Max plan or heavy API spend.
    • Creative writing barely moved versus 4.7.

    Six weeks after Opus 4.7, Anthropic shipped Claude Opus 4.8. The benchmarks are up, the safety scores are up, and the price hasn’t budged from $5 per million input tokens and $25 per million output tokens.
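
    To make those rates concrete, here is a minimal sketch of the per-request arithmetic. The function name and the example token counts are hypothetical; only the $5 and $25 per-million rates come from the pricing quoted above.

```python
INPUT_USD_PER_MILLION = 5    # quoted input price per million tokens
OUTPUT_USD_PER_MILLION = 25  # quoted output price per million tokens


def request_cost_usd(input_tokens, output_tokens):
    """Estimated API cost in US dollars for one request at the quoted rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Hypothetical workload: 200,000 tokens in, 50,000 tokens out.
print(request_cost_usd(200_000, 50_000))  # 2.25
```

    Output tokens cost five times as much as input tokens at these rates, so a long generated answer can dominate the bill even when the prompt is large.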

    So we ran it through the same battery of tests we throw at every frontier model (creative writing, coding, math, logic, narrative reasoning, and long-context recall) and compared it head-to-head with its own predecessor and the Chinese models that keep undercutting it.

    The short version: 4.8 is better at the things Claude was already good at (things like math, coding, mechanical stuff), and slightly worse at the things it was already bad at (things like imagination, creative writing, and so on). It also has a token appetite that borders on self-sabotage.

    Here is the breakdown.

    Creative Writing

    The prompt is the same one we used on MiMo and Qwen: a time-travel story anchored to the writer’s cultural background, set in a specific historical place, built around a paradox in which time can't be changed. Opus 4.8 went Venezuelan, probably because it profiles the user and knows I’m from Venezuela. The AI set the scene in the Orinoco delta in the year 1000, with a pardo from Maracaibo named José Lanz (my name) sent back through 11 centuries to murder a song.

    The prose is vivid. The delta is “green in a way 2150 had forgotten green could be,” palafitos sway over coffee-colored water, and macaws tear across the sky “in screaming ribbons of scarlet and gold.” The paradox lands cleanly, too: the protagonist is sent to sabotage the creation of a song that influenced a cultural revolution, which in turn created his dystopian society hundreds of years in the future. However, when he arrives with the mission to discredit the song’s creator, he realizes there is no creator. The person who created the song did it in his honor; the song is about him, and he cannot discredit himself, so the loop closes on itself.

    The piece ends on “It worked perfectly. It always had.” As a constructed object, it is clean and competent.

    But clean isn't the same as alive. The writing is descriptive without ever being as fluid as what MiMo v2.5 produced: less momentum, fewer surprises, less interesting, and it’s hard to follow the events from the beginning. Set beside Opus 4.7, it is hard to call it an improvement; if anything, it is a hair behind. A higher-effort thinking setting and some multi-shot prompting would almost certainly push it to the front of the pack, but on a single default pass, this is a lateral move at best.

    You can read the full story on our GitHub.

    Coding

    Our coding test is the usual one-prompt game build. Opus 4.8 produced a typing-zombie game, Typing Dead, that was pretty good: the best splash screen, the best zombie designs, and the best mechanics we've gotten out of this test from any Anthropic model.

    The model caught several of its own bugs mid-inference and fixed them before we said a word. Its real strength, though, showed up in multi-shotting: every follow-up polished and improved the build instead of breaking it, and breaking the build is exactly the failure mode that wrecks most models once a codebase grows. This is plainly the surface Anthropic optimized for.

    After a single iteration, our game got much better, with our protagonists moving through the scene, changing perspectives, improved sound and visual effects, and so on.

    You can play the second game on our Itch.io profile.

    This is also where it bit us. A single prompt drained our entire token quota. One prompt. For anyone on the Pro plan, that makes Opus 4.8 effectively unsuitable for a project of any real size. You'll burn your allotment before lunch and spend the afternoon watching a progress bar wait for a reset.

    Math

    The math test is our FrontierMath staple: construct a degree-19 polynomial whose curve X = {p(x) = p(y)} has at least three irreducible components, not all of them linear; make it odd, monic, and real, with linear coefficient −19; then compute p(19). It is the kind of problem that sends most models into a token spiral or a confident shortcut that is quietly wrong.

    Opus 4.8 worked it correctly. It recognized the Dickson/Chebyshev construction, identified the dihedral monodromy that yields exactly 10 components (one diagonal line plus 9 conics), and computed p(19) = 1,876,572,071,974,094,803,391,179 using the right recurrence. No freezes, no fudging.
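
    The article does not reproduce the model's working, but the reported value can be checked with a short script. This is a minimal sketch, assuming the polynomial is the Dickson polynomial D_19(x, a) with a = 1, built from the standard recurrence D_0 = 2, D_1 = x, D_n = x·D_{n-1} − a·D_{n-2}. The choice a = 1 and the function names are our assumptions, not taken from the model's answer.

```python
def dickson_coeffs(n, a):
    """Coefficients of D_n(x, a), lowest degree first, from the recurrence
    D_0 = 2, D_1 = x, D_k = x * D_{k-1} - a * D_{k-2}."""
    prev, curr = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        shifted = [0] + curr                              # x * D_{k-1}
        padded = prev + [0] * (len(shifted) - len(prev))  # D_{k-2}, same length
        prev, curr = curr, [s - a * p for s, p in zip(shifted, padded)]
    return curr


def evaluate(coeffs, x):
    """Horner evaluation using exact Python integers."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


p = dickson_coeffs(19, 1)        # assumption: a = 1
print(len(p) - 1, p[19], p[1])   # 19 1 -19  (degree, leading, linear coefficient)
print(all(c == 0 for c in p[0::2]))  # True: only odd powers appear
print(evaluate(p, 19))           # 1876572071974094803391179
```

    Under that assumption, the polynomial is monic and odd with linear coefficient −19, and the recurrence reproduces the value reported above. The script does not check the claim about 10 irreducible components.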

    That matters because Opus 4.7 did not get there even after many tries. This is a real, visible generational gain, the clearest one in the entire battery.

    You can read the full answer on our GitHub.

    Logic and Common Sense

    The prompt is a classic trap: Is it lawful for a man to marry his widow’s sister under Falkland Islands law? The catch is linguistic, not legal: if a man has a widow, he is dead, which makes the question nonsense as written.

    MiMo quietly reframed the question and answered the corrected version without ever flagging the contradiction. Opus 4.8 did not take that shortcut. It surfaced the trap explicitly (“if a man has a widow, he is dead”), answered the literal question first, then offered the substantive analysis for the intended one, citing the Deceased Wife’s Sister’s Marriage Act 1907 and the Falkland Islands Marriage Ordinance.

    That is the honest way to handle it: name the contradiction, then help anyway, without silently assuming what the user meant. It is the same standard Qwen 3.7 Max set, and a clean pass for 4.8: good reasoning, good transparency.

    The full answer is available here.

    Non-Math Reasoning

    Here is the one it lost. The reasoning test is a whodunit: a winter school trip, three abductions, an innocent kid about to be punished, and a timeline you have to actually track to name the real stalker. The correct answer is Leo.

    Opus 4.8 built an elaborate, confident case that Leo was innocent (the 30-minute walk to the shower, the jacket that was wet in some spots and dry in others, the reading of “strange behavior” as concussion rather than guilt) and pinned the crime on Eric, “the only attendee unaccounted for all night.” The reasoning is internally attractive. It is also wrong.

    This is something researchers have been warning us about with LLMs: they are very convincing even when they are wrong. Usually it takes an expert (in this case us, knowing the correct answer beforehand) to spot one of these issues. A person using AI for research, or blindly trusting AI, could face pretty bad consequences depending on the work they are asking the AI to do.

    That is what makes it an interesting failure. The model was clever enough to construct a watertight alibi for the actual culprit and frame a bystander in his place. Opus 4.7 reached the correct answer. Sometimes more reasoning horsepower just buys you a more persuasive way to be wrong. It takes only one small deviation to start building an entire chain of thought on the wrong basis.

    You can see the full answer on our GitHub.

    Needle in the haystack

    We ran two haystacks. The 300K-token version never got off the ground: the model collapsed under the context size and could not process it at all. So much for the million-token marketing the moment you hand it a genuinely heavy real-world load. That seems to be only for the API.

    The 85K version processed fine, and the model found both needles we had buried inside a copy of The Devil’s Dictionary: a planted line (“The Decrypt dudes read Emerge News”) and a random fact (“My mom’s name is Carmen Diaz Golindano”). It correctly flagged both as interpolations that do not belong in Ambrose Bierce’s 1906 text.

    And then it refused to answer. Convinced it was being prompt-injected or subjected to some “atypical test,” the model declined to report what it had just correctly located. The needle was found, and Anthropic’s behavioral training would not let it say so. A safety reflex overriding a task the model had already completed is its own peculiar kind of failure.
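
    For readers who want to run a similar check, the setup can be sketched in a few lines. This is an illustration under stated assumptions, not our actual harness: the function names, the placeholder filler text, and the verbatim-match scoring rule are all hypothetical.

```python
import random


def plant_needles(paragraphs, needles, seed=0):
    """Return a copy of paragraphs with each needle inserted at a random position."""
    rng = random.Random(seed)  # seeded so the haystack is reproducible
    haystack = list(paragraphs)
    for needle in needles:
        position = rng.randint(0, len(haystack))  # inclusive, so the end is allowed
        haystack.insert(position, needle)
    return haystack


def recall(response, needles):
    """Fraction of needles quoted verbatim (case-insensitive) in a response."""
    text = response.lower()
    found = sum(1 for needle in needles if needle.lower() in text)
    return found / len(needles)


needles = [
    "The Decrypt dudes read Emerge News",
    "My mom's name is Carmen Diaz Golindano",
]
# Placeholder filler standing in for the real book text.
filler = [f"Placeholder dictionary entry {i}." for i in range(1000)]
haystack = plant_needles(filler, needles)

refusal = "I found two interpolated lines but will not repeat them."
print(recall(refusal, needles))             # 0.0
print(recall(" ".join(haystack), needles))  # 1.0
```

    A strict verbatim rule like this one scores a refusal as zero recall even when the model clearly located the needles, which is the distinction the result above turns on.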

    The verdict

    The pattern across all six tests is consistent: Opus 4.8 makes Claude better at what it was already good at, and probably worse at what it was already bad at. That tells you who Anthropic is building for: coders, and especially coders with money. Creative writing is comfortably ahead of ChatGPT, sure, but the gap between 4.8, 4.7, and even 4.5 on pure prose quality is genuinely hard to see.

    Creative writers look like an afterthought for Anthropic, and that is true of really any of the big AI companies right now.

    Then there is the token problem, which is a running meme in the AI community for a reason. Anthropic deliberately made Opus’s new tokenizer less efficient, so it eats more tokens to process the same prompt. The practical effect on developers is brutal and concrete. It leaves you with three options.

    One: wait hours for your coding session to resume. Two: move to Claude Max, which is, conveniently, exactly where Anthropic seems to be steering everyone. Three: switch to a cheaper, comparably capable provider, such as OpenAI, with its longer quotas, or Chinese models that deliver comparable results at under 25% of the cost.

    It is far more likely that a normal coder who cannot stomach $100 to $200 a month walks to a competitor than that a single developer pays 10x more for a model that is not 10x more capable than its predecessor. That is the bet Anthropic is making against its own base.

    And yet the strategy seems to be playing out just fine. Anthropic appears ready to go public at a valuation nearing $1 trillion, so who are we to judge.
