Anthropic launches Claude Opus 4.8

Discussion

Opus 4.8 is out, available the same day it was announced, and the headline pitch is judgement rather than raw capability. Anthropic says the model is more honest about its own progress and can run independently longer than 4.7, and early testers report it was roughly four times less likely to wave through flaws in code it had just written — it flags its uncertainties instead of papering over them. That self-skepticism is the kind of thing that actually changes how it feels to hand an agent a long task. The benchmark deltas back up a steady-rather-than-dramatic story: agentic coding climbs from 64.3% to 69.2%, multidisciplinary reasoning with tools from 54.7% to 57.9%, agentic computer use barely moves (82.8% to 83.4%), and the knowledge-work score jumps 1753 to 1890. Pricing is unchanged from 4.7 and the fast mode is about 2.5x quicker. Anthropic also says a Mythos-class model for all customers is coming in the next few weeks, so this may not be the release people remember in a month.

└─MacRumors

2026-05-28