✳︎ Panoptia Labs

Cat Wu (Anthropic)

Product Management on the AI Exponential

March 21, 2026 · David Latz

Anthropic Blog · Head of Product, Claude Code

TL;DR

Anthropic's Head of Product for Claude Code describes how exponentially improving models break the traditional PM playbook — and the four shifts teams need to stay on the curve instead of behind it.

Reasoning Seed

A Reasoning Seed is a structured prompt you can copy into your AI reasoning tool (Claude, ChatGPT, Obsidian, Notion). It contains the article's thesis, its core tension, and our lab context — ready for your own analysis.

You'll find more ways to engage with this content in the discussion questions below.

Tension: If technical feasibility changes faster than the planning cycle — is product management still planning, or just reacting?

Lab context: What Cat Wu describes at Anthropic is what we experience daily in the lab: product management on an exponential curve demands different planning horizons and more prototyping cycles.

Key Insights

1 — The PM Playbook Rests on an Assumption That No Longer Holds

Traditional product management assumes that what’s technologically possible at the start of a project is roughly what’s possible at the end. With exponentially improving models, this assumption breaks down: features that were impossible at sprint start become feasible mid-sprint. Wu illustrates this with her own experience — Claude Code failed at simple tasks with Sonnet 3.5, worked occasionally with Opus 4, and delivered reliable live demos with Opus 4.6.

2 — Four Operational Shifts for Teams on the Exponential

Wu distills four adaptations:

  • Short sprints over long-term roadmaps — the team runs “side quests” (self-directed experiments outside the official roadmap) that produced features like Claude Code on Desktop.
  • Demos over documentation — a rough prototype changes the conversation more than a polished spec.
  • Re-evaluate features with every model release — the Chrome integration emerged when the team noticed users manually switching between Claude Code and their browser.
  • Embrace simplicity — complex workarounds become obsolete when the next model solves the task natively; the team cut its system prompting by 20% with Opus 4.6.

3 — Three-Tool Workflow as Division of Labor

Wu describes a clear division of labor: Claude.ai for strategic thinking and ideation; Claude Code for prototypes, evaluations, and scripts; Cowork for knowledge work, planning, and administration. Peers at Decagon and Datadog report similar hybrid workflows that dramatically shorten development cycles.

4 — A ~34x Improvement in 16 Months

Wu cites METR research: Opus 4.6 can complete software tasks that take a human roughly 12 hours, compared to 21 minutes for Sonnet 3.5. That works out to roughly a 34x improvement in 16 months. This isn’t a linear trend but an exponential curve, and it fundamentally compresses planning horizons.
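As a quick sanity check on the exponential framing, the two cited data points (21-minute tasks then, 12-hour tasks 16 months later) imply a capability doubling time of roughly three months. The tidy exponential fit is our back-of-envelope assumption, not METR's methodology:

```python
import math

# Back-of-envelope on the METR data points cited above:
# Sonnet 3.5 handled ~21-minute tasks; 16 months later,
# Opus 4.6 handles ~12-hour tasks.
old_horizon_min = 21            # task length in minutes, Sonnet 3.5
new_horizon_min = 12 * 60       # 12 hours in minutes, Opus 4.6
months_elapsed = 16

improvement = new_horizon_min / old_horizon_min   # ~34x
doublings = math.log2(improvement)                # ~5.1 doublings
doubling_time = months_elapsed / doublings        # ~3.1 months

print(f"{improvement:.0f}x over {months_elapsed} months "
      f"-> capability doubles roughly every {doubling_time:.1f} months")
```

A doubling time of about a quarter, if it held, would mean feasibility can shift within a single planning cycle, which is exactly the pressure on roadmaps the article describes.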

5 — From Engineering Tool to Organization-Wide Acceleration

The effect doesn’t stop at product and engineering. Data science, finance, legal, marketing, and design adopt AI-native workflows. The shift: instead of sequential handoffs between departments, parallel AI-powered work processes emerge across the entire organization.

Critical Assessment

What Holds Up

  • The observation that traditional roadmap planning doesn’t scale with exponential model improvement is empirically grounded and matches the experience of many AI-native teams
  • The three-tool workflow is a concrete, immediately applicable pattern — not an abstract framework
  • The METR improvement data is externally verified and provides a measurable reference point
  • The four shifts are operationally specific enough to be actionable yet generic enough for diverse team contexts

What Needs Context

  • Self-interest: Cat Wu is Head of Product for Claude Code at Anthropic — the piece is implicitly also marketing for Anthropic’s own product ecosystem
  • Survivorship bias: The described “side quests” work in a well-funded startup with top talent. Whether the model transfers to less resource-rich teams remains open
  • Metrics are missing: Beyond the METR reference, there are no quantitative measures for the claimed productivity gains from the new workflows
  • No failure modes: The piece describes only successes. Which side quests failed? When did a prototype that arrived too fast set false expectations?
  • Tooling lock-in: The three-tool workflow is based entirely on Anthropic products. Alternative stacks or hybrid setups are not discussed

Discussion Questions for the Next Lab

01 Roadmap Revision: If traditional roadmaps fail under exponential model improvement — how do we plan client projects where scope and feasibility can shift fundamentally during implementation?

02 Side Quests as Method: Wu describes self-directed experiments outside the roadmap as an innovation source. What would such a format look like in a consulting context — with fixed budgets, deadlines, and client expectations?

03 Prototype Culture: “Even a rough prototype changes the conversation” — how do we shift our own balance from documentation to demos without sacrificing architecture quality and maintainability?

04 Re-evaluation as Discipline: Every new model should trigger re-evaluation of existing features. How do we systematize this without ending up in permanent re-planning? What’s the right interval?

05 Organization-Wide Shift: Wu describes how not only engineering but also legal, finance, and marketing go AI-native. Which of our clients are ready for this shift — and where is the governance gap?

Glossary

Side Quest A self-directed experiment outside the official product roadmap. Serves exploratory innovation in environments with high uncertainty about future feasibility.

METR (Model Evaluation & Threat Research) An independent research organization that evaluates AI models for capabilities and risks. Provides standardized benchmarks for task complexity and agent performance.

System Prompting Instructions given to a language model before the actual user query to steer behavior, tone, and capabilities. Less system prompting with better models suggests the model infers more context independently.

Exponential Curve A growth pattern where capability doubles at regular intervals rather than increasing linearly. In the AI context: model capabilities grow faster than human planning typically anticipates.

Curated by David Latz · Panoptia March 2026
