Opus 4.6, 1M Context by Default: Transformer Killers, For Real This Time?
Why would you make 1M context the default at no extra cost unless something changed architecturally?
Anthropic just made 1M context the default on Opus 4.6, at no extra cost.
The interesting part isn’t the context length itself. It’s that it’s the default and it doesn’t cost more. Why would you do that unless you have real efficiency gains? Maybe there are hardware/deployment reasons where supporting a single context size is simpler, but to me this could be a sign that hybrid architectures are actually in use at frontier labs.
Transformers with attention have been the backbone of these models since the original 2017 paper. There have been plenty of optimizations, but the core architecture hasn’t really changed. The “transformer killers” (state space models, gated delta networks, etc.) have been “coming soon” for years.
The short version: self-attention compares every token to every other token, so transformers scale quadratically with context length, O(n²) in compute and memory. SSMs instead keep a fixed-size hidden state and process tokens sequentially in O(n). That’s the kind of difference that would make 1M context free instead of expensive.
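To make the scaling difference concrete, here’s a toy sketch (plain NumPy, nothing like any lab’s actual architecture): attention materializes an n×n score matrix, while an SSM-style recurrence carries only a fixed-size state from token to token. The `decay` parameter is an invented simplification standing in for a real SSM’s learned state transition.

```python
import numpy as np

def attention(x):
    """Single-head self-attention over x of shape (n, d).
    Builds an (n, n) score matrix -- O(n^2) compute and memory."""
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)                        # (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                                   # (n, d)

def ssm_scan(x, decay=0.9):
    """SSM-flavored linear recurrence over x of shape (n, d).
    The hidden state has fixed size d regardless of n -- O(n) total."""
    state = np.zeros(x.shape[1])
    out = np.empty_like(x)
    for t, token in enumerate(x):
        state = decay * state + token                    # fixed-size state update
        out[t] = state
    return out

x = np.random.default_rng(0).standard_normal((8, 4))
print(attention(x).shape, ssm_scan(x).shape)
```

Double the sequence length and the attention score matrix quadruples, while the scan’s state stays the same size; that asymmetry is the whole argument for why a long default context might signal something other than brute force.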
I don’t know if that’s what’s happening here. But the combination of 1M-by-default at no extra cost and the recent progress on hybrid architectures is at least suggestive. If it is, expect even larger default contexts from here.