Opus 4.7 launches with mixed reception — the MRCR regression and the 4.6-rollback theory

Anthropic shipped Opus 4.7 on April 14. The community read: it's 4.6 with a fix, MRCR retrieval regressed, and the thinking-effort toggle vanished. The discourse is sharp.

C Charles Lin · April 15, 2026

Anthropic shipped Claude Opus 4.7 on April 14, 2026. The launch thread on r/ClaudeAI hit 3,351 upvotes — which sounds enthusiastic until you read the comments. The dominant community sentiment was skepticism, with a sharp focus on three specifics: long-context retrieval (MRCR) regressed from 78.3% to 32.2%, the thinking-effort UI toggle was removed, and the working theory is that 4.7 is 4.6 with a quality rollback rather than a new training run.

Theo’s April 14 video — “Anthropic thinks they’re Apple. They’re actually hypocrites.” — captures the broader narrative the launch landed into. This wasn’t a “ship a great model into a clear sky” launch. It was a “ship a fix-shaped model into months of accumulated complaints” launch — and the community knew the difference.

What 4.7 actually changed

From Anthropic’s announcement and the launch thread:

“Handles long-running tasks with more rigor.” Marketing framing for what looks like reasoning-effort recalibration.
“Substantially better vision.” 3x resolution; visible improvements on slides, UI generation, document understanding.
“Verifies its own outputs before reporting back.” Internal verification step before tool returns — meant to reduce hallucination on agent workflows.
Available immediately on Claude, Claude Code, and API.

The vision improvements are real and shipped. The “more rigor” framing is what the community is contesting. From the r/ClaudeAI launch thread top comments:

“Great finally we get the real 4.6 back!” — 483 upvotes

“Long context retrieval: MRCR v2 @ 1M tokens — 4.6: 78.3% — 4.7: 32.2% ⚠️ Regression — long-context retrieval is worse” — top technical comment, citing the system card

“They turned off thinking effort settings for Opus 4.7 in the Claude App :|” — 363 upvotes

That MRCR number is the single most consequential criticism. Multi-Round Conversation Recall at 1M tokens dropped by more than half. Anthropic shipped that in the system card themselves; this wasn’t a Reddit conspiracy. Boris (Claude Code team) followed up acknowledging the regression — which is honest, but doesn’t change the fact that the model that just launched is materially worse on long-context retrieval than the one it replaces.

The “is this 4.6 plus a fix” theory

The widely-circulating community theory: Opus 4.7 is the pre-degraded 4.6 weights, rebranded. The argument:

r/ClaudeAI through March documented a reasoning-effort regression in 4.6 (4,352 upvotes) — the “car wash test” failures, silent quality degradation, no changelog
Anthropic acknowledged that on March 4 they changed Claude Code’s default reasoning effort from high to medium to reduce latency, then reverted it on April 7 after pushback
One week after the revert, 4.7 launches with “more rigor” framing and the thinking-effort UI toggle removed

The theory connects the dots: rather than admit 4.6 had been silently degraded for months, Anthropic shipped the un-degraded version as “4.7” — getting credit for shipping while papering over the regression. This is unprovable without Anthropic releasing weights diffs, which won’t happen. But the timing is suggestive enough that the community has settled on this read.

Creator POV vs Reddit dissent

Theo’s framing is sharp and adversarial. The Hypocrites video lands the argument that Anthropic-the-company has positioned itself as the “responsible AI lab” while shipping practices (silent quality changes, paid users hitting usage limits, models that retain training-data biases the company publicly criticizes) that don’t match the brand. His follow-up “killed Figma” video on April 21 reframes 4.7’s vision improvements as Anthropic moving directly into Figma’s territory — which is a positive technical read but layered on top of the broader trust narrative.

Fireship’s coverage — “Claude just got another superpower…” — takes the inverse framing: 4.7’s improvements are real, the vision boost is impressive, the agent verification step matters. This is the consensus among creators who use Claude as a tool rather than diagnose Anthropic as a company.

The Reddit dissent is louder than the creator dissent. Top comments in r/ClaudeAI’s “Stop shipping” thread (3156 upvotes):

“I cancelled my $200 sub. I HOPE to re-sub. But honestly cancelling it is the ONLY way to make them listen.” — 438 upvotes

“Shipping stuff like features is fine, shipping a ‘fixed price usage plan’ that they never had the capacity to support is the real issue.” — 83 upvotes

The substantive critique is about usage limits and capacity, not model quality per se. Users feel they were sold a service (“Max plan, $200/mo, use Claude”) that doesn’t deliver the promised capacity. 4.7 doesn’t fix that; if anything, the higher reasoning effort burns through quota faster.

What this means for working engineers

Three honest considerations:

1. 4.7 is the right default for new work in mid-2026. Whatever the launch narrative, the deliverable is better than 4.6 on most tasks. Vision is significantly better; verification reduces some agent failure modes. For new code-gen work, default to 4.7.

2. If you have long-context retrieval workflows, validate before switching. The MRCR regression is real and consequential. If you have a workflow that depends on long-context recall (large codebase navigation, long-document analysis, multi-round agent loops), test 4.7 against 4.6 on your actual workload before committing.

3. The usage-limits friction is the real cost. Across r/ClaudeAI in April 2026, the recurring complaint is “I run out of usage.” Even with the same model quality, hitting your weekly limit kills productivity. If your team is on Max-tier subs, plan for capacity bursting strategies — fallback to API pricing, alternative models for non-critical work, scheduled batch operations.

The honest critique

What the discourse gets wrong:

The “4.6 with a fix” theory is plausible but unfalsifiable. Believing it requires assuming Anthropic acted in bad faith. Believing the opposite requires assuming the timing is coincidence. Reasonable people will land in different places.
MRCR is one benchmark, not the whole story. Long-context retrieval matters, but for most workflows under ~200K tokens, 4.7 performs better than 4.6.
Anthropic shipped the regression in the system card. That’s the actual signal here — they didn’t hide it. The criticism is fair but should reckon with the fact that the bad number was published, not buried.

For most working engineers in mid-April 2026: upgrade to 4.7 by default, validate long-context workloads, plan for usage limits, and watch how Anthropic handles the next ship cycle. The interesting question isn’t whether 4.7 is good — it’s whether the 4.6 silent-degradation pattern was a one-time misstep or a structural reality of running inference-capped hosted models.

Sources

Every reference behind this piece. If we make a claim, it's because at least one of these said so — or we lived it ourselves.

YouTube Theo (t3dotgg) — "Anthropic thinks they're Apple. They're actually hypocrites." — Theo / t3dotgg
YouTube Theo (t3dotgg) — "Did Anthropic just kill Figma?" — Theo / t3dotgg
YouTube Fireship — "Claude just got another superpower..." — Fireship
Docs Anthropic — Claude Opus 4.7 announcement — Anthropic
Blog r/ClaudeAI — "Introducing Claude Opus 4.7" launch thread (3351 upvotes) — r/ClaudeAI
Blog r/ClaudeAI — "Anthropic: Stop shipping. Seriously." critique thread (3156 upvotes) — r/ClaudeAI
Firsthand Tracking Claude model versions across Opus 4.5, 4.6, 4.7 in production workflows