Caroline Bishop
Feb 17, 2026 18:34
Claude’s new dynamic filtering feature cuts input tokens by 24% while improving search accuracy. Opus 4.6 hits 61.6% on the BrowseComp benchmark.
Anthropic has rolled out a major upgrade to Claude’s web search capabilities, with the AI assistant now writing and executing code on the fly to filter search results before processing them. The change delivers an average 11% accuracy gain while consuming 24% fewer input tokens, according to the company’s internal benchmarks.
The update, released alongside Claude Opus 4.6 and Sonnet 4.6, addresses a persistent problem in AI-powered web search: context window bloat. Traditional search tools pull entire HTML files into memory, much of it irrelevant noise that degrades response quality and burns through tokens.
How Dynamic Filtering Works
Rather than reasoning over raw HTML dumps, Claude now dynamically generates code to post-process query results. The system keeps relevant data and discards the rest before anything hits the context window. Think of it as the AI building its own custom search scraper in real time.
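To make the idea concrete, here is a minimal sketch of the kind of post-processing code a model might write. It is an illustration only, not Anthropic's implementation: the `TextExtractor` class, the `filter_page` helper, the keyword-matching rule, and the sample page are all hypothetical, and only Python's standard library is used.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, skipping <script> and <style> content."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def filter_page(html, keywords):
    """Hypothetical filter: keep only text chunks that mention a keyword."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    wanted = [k.lower() for k in keywords]
    return [c for c in parser.chunks if any(k in c.lower() for k in wanted)]


# Made-up page: mostly navigation, scripts, and promo text.
raw_html = """
<html><head><style>body { margin: 0 }</style><script>track();</script></head>
<body><nav>Home | About | Login</nav>
<p>Opus 4.6 reached 61.6% on BrowseComp.</p>
<p>Subscribe to our newsletter!</p></body></html>
"""

kept = filter_page(raw_html, ["BrowseComp"])
print(kept)
```

In this toy case only the one sentence mentioning the keyword survives, so the navigation bar, script, stylesheet, and newsletter prompt would never reach the context window.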
Anthropic tested the approach on two industry benchmarks. On BrowseComp, which measures an agent’s ability to find deliberately hard-to-find information across multiple websites, Opus 4.6 jumped from 45.3% to 61.6% accuracy. Sonnet 4.6 climbed from 33.3% to 46.6%.
DeepsearchQA, which tests systematic multi-step research with many correct answers, showed similar gains. Opus 4.6’s F1 score rose from 69.8% to 77.3%, while Sonnet 4.6 improved from 52.6% to 59.4%.
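The four reported results can be checked against the headline figure with a few lines of arithmetic. The sketch below assumes the "average 11%" is the simple mean of the percentage-point gains across both models and both benchmarks; the article does not state how the average was computed, so treat that reading as an assumption.

```python
# Scores (percent) before and after dynamic filtering, as reported in the article.
results = {
    ("BrowseComp", "Opus 4.6"): (45.3, 61.6),
    ("BrowseComp", "Sonnet 4.6"): (33.3, 46.6),
    ("DeepsearchQA", "Opus 4.6"): (69.8, 77.3),
    ("DeepsearchQA", "Sonnet 4.6"): (52.6, 59.4),
}

# Gain in percentage points for each benchmark/model pair.
gains = {key: round(after - before, 1) for key, (before, after) in results.items()}

# Assumed aggregation: an unweighted mean over the four gains.
mean_gain = sum(gains.values()) / len(gains)

for (benchmark, model), gain in gains.items():
    print(f"{benchmark:13s} {model:11s} +{gain} points")
print(f"mean gain: {mean_gain:.1f} points")
```

Under that assumption the mean comes out to about 11 percentage points, which matches the headline number.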
Real-World Validation
Quora’s Poe platform, which serves millions of users across 200+ AI models, has already tested the upgrade internally. “The model behaves like an actual researcher, writing Python to parse, filter, and cross-reference results rather than reasoning over raw HTML in context,” said Gareth Jones, the company’s Product and Research Lead. Quora found that Opus 4.6 with dynamic filtering achieved the highest accuracy among the frontier models in its internal evaluations.
Token Economics Get Complicated
Cost implications vary by use case. Price-weighted tokens decreased for Sonnet 4.6 across both benchmarks but actually increased for Opus 4.6, because the more powerful model generally writes more complex filtering code. Anthropic recommends that developers benchmark against their specific query patterns before deployment.
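A rough sketch shows how input tokens can fall while price-weighted tokens rise. The article does not define the metric, so this assumes it means output tokens counted at a multiple of input tokens. The 5x weight and every token count here are invented for illustration; only the 24% input reduction comes from the reported figures.

```python
def price_weighted_tokens(input_tokens, output_tokens, output_weight=5):
    """Assumed metric: output tokens count `output_weight` times as much as input.

    The 5x default is a hypothetical price ratio, not a published rate.
    """
    return input_tokens + output_weight * output_tokens


# Hypothetical query without dynamic filtering.
baseline = price_weighted_tokens(input_tokens=100_000, output_tokens=2_000)

# Same query with 24% fewer input tokens and a short filtering script.
light = price_weighted_tokens(input_tokens=76_000, output_tokens=4_000)

# Same input savings, but the model writes a longer, more complex script.
heavy = price_weighted_tokens(input_tokens=76_000, output_tokens=8_000)

print(baseline, light, heavy)  # 110000 96000 116000
```

With these made-up numbers the short script lowers the weighted total while the longer one raises it above the baseline, which is why measuring against real query patterns matters.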
Dynamic filtering ships enabled by default for the new web search and web fetch tools on the Claude API. The company also graduated several related tools to general availability: code execution sandboxes, persistent memory across conversations, programmatic tool calling, and dynamic tool discovery.
For developers building search-heavy applications (think research assistants, citation verification tools, or competitive intelligence bots), the upgrade could meaningfully cut operational costs while improving output quality. The API documentation is live now on Claude’s developer platform.

