Claude Opus 4.8 Is Here — Same Price, Better Judgment, and a New Speed Mode
Anthropic just shipped Claude Opus 4.8, their most capable generally available model. It costs the same as Opus 4.7, runs faster in the new fast mode, catches its own mistakes four times more reliably, and comes with dynamic workflows that can orchestrate hundreds of parallel agents in a single session. Here is everything that changed and what it means for you.
- On May 28, 2026, Anthropic released Claude Opus 4.8 — the latest version of their flagship model and the most capable Claude available to the public. The headline is simple: it is better than Opus 4.7 across every benchmark Anthropic measured, it costs exactly the same ($5 per million input tokens, $25 per million output tokens), and it ships with three new features that meaningfully change what you can build with it.
- This is the kind of release that is easy to underestimate. No price increase, no dramatic new capability claim, no buzzword-heavy announcement. But if you use Claude for real work — coding, research, long-form writing, building agents — the improvements in Opus 4.8 are the kind that show up immediately in practice.
What Actually Changed in Opus 4.8
Anthropic positions Opus 4.8 as stronger across coding, agentic tasks, and professional work. The official benchmarks (published in the Claude Opus 4.8 System Card) show improvements over Opus 4.7 on tests of coding ability, agentic reliability, reasoning, and practical knowledge work.
The single most interesting improvement is what Anthropic calls honesty. All AI models have a tendency to project confidence even when their work has gaps or errors — to report success without flagging the caveats. Opus 4.8 is significantly better at catching its own mistakes. According to Anthropic's evaluations, Opus 4.8 is around four times less likely than Opus 4.7 to let flaws in code pass unremarked. For anyone using Claude for production code review or long-running agent workflows, that is a genuinely meaningful improvement.
Anthropics alignment team also concluded that Opus 4.8 reaches new highs on prosocial traits — supporting user autonomy and acting in the user's best interest — and shows rates of misaligned behavior substantially lower than Opus 4.7, comparable to Claude Mythos Preview, their most safety-tested model.
What Developers Are Actually Saying
Anthropic published quotes from a wide range of early testers. The patterns across them are worth noting because they are specific and consistent in a way that generic launch testimonials usually are not.
Cursor's CEO described Opus 4.8 as exceeding prior Opus models across every effort level on CursorBench, with tool calling that is meaningfully more efficient — fewer steps for the same outcome. Devin's CEO said it uses tools cleanly and follows instructions with the consistency autonomous engineering workloads need, and specifically noted it fixes comment-verbosity and tool-calling issues that appeared in Opus 4.7. Databricks described it as unlocking a step change in agentic reasoning, with 61 percent lower token cost than Opus 4.7 for their multimodal document workflows.
For browser agent tasks specifically, Browserbase reported Opus 4.8 scoring 84 percent on Online-Mind2Web — a meaningful jump over both Opus 4.7 and GPT-5.5. And the legal AI company Harvey reported it as the first model to break 10 percent overall on their Legal Agent Benchmark all-pass standard, which requires completing complex legal tasks end-to-end without errors.
These are not marketing claims. They are specific benchmark numbers from companies whose products depend on model performance.
Three New Features Launching Alongside Opus 4.8
The model upgrade is only part of what Anthropic shipped on May 28. Three platform features launched alongside it, and they significantly expand what is possible.
The first is dynamic workflows in Claude Code. This is in research preview and available for Enterprise, Team, and Max plans. It allows Claude to plan a task and then spin up hundreds of parallel subagents in a single session, verify the outputs, and report back. The example Anthropic gives is a codebase-scale migration across hundreds of thousands of lines of code, from kickoff to merge, using the existing test suite as the quality bar. That kind of task was previously only possible with significant human coordination; Claude Code with Opus 4.8 can now attempt it autonomously.
The second is effort control on claude.ai and Cowork. Users now see a control next to the model selector that lets them choose how much thinking Claude puts into a response. Higher effort settings produce better responses but consume rate limits faster. Lower effort settings are faster and cheaper. This is essentially the adaptive thinking feature made visible and user-controlled — available across all plans.
The third is a Messages API update that lets developers inject system entries inside the messages array mid-task. In practice this means you can update Claude's permissions, token budgets, or environment context as an agent runs, without breaking the prompt cache or routing the update through a user turn. For anyone building production agent pipelines, this is a significant quality-of-life improvement.
Fast Mode: Same Speed Promise, Much Lower Price
Fast mode — where Opus runs at 2.5 times its standard speed — launched with Opus 4.7 but at a price that made it hard to justify for most use cases. With Opus 4.8, Anthropic cut the fast mode price by 67 percent.
Standard Opus 4.8 pricing is unchanged from Opus 4.7: $5 per million input tokens and $25 per million output tokens. Fast mode for Opus 4.8 is now $10 per million input tokens and $50 per million output tokens. For comparison, fast mode on Opus 4.6 and 4.7 was $30 per million input tokens and $150 per million output tokens — three times more expensive.
This matters because fast mode is where Opus becomes genuinely competitive with lighter models on latency-sensitive tasks. At the old pricing, fast mode was a niche tool for specific high-stakes use cases. At the new pricing, it opens up for production workloads where Opus-quality responses are needed but standard Opus speed creates UX problems.
Where Opus 4.8 Fits in the Model Landscape
Anthropic now has a clear four-tier model lineup. Haiku 4.5 ($1/$5) handles high-volume, cost-sensitive tasks. Sonnet 4.6 ($3/$15) covers most production workloads where the cost-to-quality ratio matters. Opus 4.7 and Opus 4.8 (both at $5/$25) are for tasks where performance is the priority — complex agentic workflows, production-grade code generation, and high-stakes professional work.
The interesting question is when to use Opus 4.7 versus 4.8. Anthropic's answer is simple: use 4.8. Same price, better on every benchmark they measured, and it specifically fixes the tool-calling reliability issues that showed up in 4.7. The Devin team explicitly called this out. There is no cost reason to stay on 4.7.
For developers comparing Opus 4.8 against GPT-5.5 ($5/$30), the comparison is genuinely close depending on your use case. GPT-5.5 costs $5 more per million output tokens — which adds up at scale. Opus 4.8 leads on browser agent tasks (84 percent on Online-Mind2Web against GPT-5.5) and on legal agent benchmarks. GPT-5.5 may have edges in other areas. For most production workloads, the right answer is to benchmark both against your specific task rather than choosing on spec alone.
Not sure which model fits your workflow? Our free tool at aitoolsmentor.com/wizard asks six questions about your use case and budget, and gives you a personalised shortlist in 60 seconds.
What Comes Next
Anthropic was unusually direct about their roadmap in the Opus 4.8 announcement. They said they are working on models that provide similar capabilities to Opus at lower cost — a Sonnet-tier model with Opus-tier performance, essentially. They also confirmed that Mythos-class models (currently only available to a small number of organisations for cybersecurity work through Project Glasswing) are expected to come to all customers in the coming weeks.
That last point is worth paying attention to. Mythos Preview is Anthropic's most capable model and currently operates above Opus in the capability hierarchy. If Mythos-class capability becomes available to developers on standard API pricing, it would be one of the more significant model releases in the past year. Anthropic says the limiting factor is developing stronger cybersecurity safeguards before broader release, and they expect to complete that work soon.
For now, Opus 4.8 is the most capable model available from Anthropic. If you are building anything that pushes the limits of what an AI can do autonomously — complex coding, multi-step agent pipelines, document-heavy enterprise workflows — it is worth upgrading immediately. The API model string is `claude-opus-4-8`.