22 Jun 2026 • 8 min read

It's Time to Retire Code Reviews

Code reviews made sense when coding was slow and expensive. Then AI broke the rules. Thousands of lines a minute, reviews still taking hours, senior engineers drowning. The fix isn't more AI. It's moving the leverage upstream: stop reviewing what was built, start aligning on what will be built.


Segovia Cathedral. Such a masterpiece was surely created with a plan.

I never liked code reviews.

Understanding someone else's code is hard. Scrolling through the narrow diffs, trying to infer context and intention from those scattered code chunks. Hunting for misaligned design and the butterfly effects in the legacy codebase, caused by innocent-looking changes. Difficult, time-consuming and exhausting.

But effort was never the biggest problem. In my opinion, code reviews have always approached the problem from the wrong end.

Code review happens after the most expensive step - the implementation. All the important mistakes, the flawed design, the architecture misalignment, the broken requirements, are already baked into code that took several days to write. Fixing them now means rewriting it from scratch. More days of work! Another delayed release.

When the cost of change is so high, one of two things happens: either egos clash and engineers start arguing over what the correct design is, or conflict avoidance kicks in and the code review just skims the surface, checking trivial things like formatting or typos. Either way, the purpose of the whole process is lost.

All of this is caused by applying leverage in the wrong place.

Furthermore, a typical code review lacks intent. What are you trying to achieve? Why did you take this approach? What other possibilities did you consider? What was your reasoning process? All this is missing from the typical Pull Request. Sure, you can include all this information in the PR description, but, yet again, all this happens after the implementation.

And Then, Vibe Coding Happened

Code reviews, with all their flaws, still made sense when coding was slow and expensive. Days to write a Pull Request, hours to review it properly. Fair trade.

Then AI came. And broke the rules.

Even Google acknowledged at I/O 2025 that code review is now the bottleneck, and they don't know how to solve it. With 50% of Google's code already AI-generated (pushing toward 75%), the pressure is only going to increase. If the company with the most engineering resources on the planet can't solve this, no one is patching their way out of it.

AI agents have inverted the trade. Coding is now cheap and fast. Thousands of lines of code generated per minute. Entire features crammed into a single merge request. More PRs, bigger than ever. And reviews take even more time now, as the lack of a human on the other side makes reviewers more suspicious and careful.

Senior engineers are drowning. Tickets stuck in "In Review" for days. Burnout creeping in quietly, review by review.

And to make things worse, AI brings its own category of defects. The code looks good. But it's over-simplified or, quite the contrary, over-engineered with unnecessary layers of abstraction. Too narrow a context causes agents to duplicate solutions and produce designs misaligned with the rest of the codebase. Those are precisely the hardest problems to catch in a code review.

The whole process is buckling under a load it was never built for.

More AI Won't Fix This

The obvious solution is more AI. Let it write. Let it review. Let it fix. A fully automated software factory, churning out features at superhuman speed.

Not so fast.

AI reviews are a thing and they catch real bugs. But there's a whole category of issues entirely out of their reach: human context, product vision, high-level architecture, alignment across modules and systems, the judgment on what good even looks like. That's not something an AGENTS.md file can solve.

AI is great at coding. It's bad at software design. It's locally reasonable but globally incoherent. You can delegate the coding. You cannot delegate the decision of what to build, how it fits the system, and how it will evolve. That remains a human capability.

So the bottleneck was never speed. Speed is solved. The bottleneck is alignment. It's design. It's understanding. Agents need guidance and control, just not the kind built for the old era.

The Old Process Is a Fossil

The traditional formula of code reviews was not intentional. It was a consequence.

Our classic SDLC ran as a tiny waterfall inside each sprint: Requirements → Refinement → Planning → Implementation → Review → QA → Deploy. Features sliced into small tickets, distributed across specialisations, carefully estimated, tracked on the board. And at the end of each chunk: a review. Because the chunk was small enough to review.

Small, incremental Merge Requests weren't just a best practice. They were a direct byproduct of small, incremental, human-paced implementation. The review process was shaped by the constraints of the work that preceded it.

None of those constraints exist anymore.

Why split a feature into chunks when a full implementation takes minutes?
Why estimate effort when the machine does the effort?
Why split frontend and backend work if AI is full stack by design?
Why keep PRs small if everything else can be done in one go?

The constraints are gone. But the habits remain. We're applying a review process designed for slow, manual craftsmanship to industrial-scale code generation, and wondering why it's breaking.

Finding the Leverage

Cheap code created a dangerous illusion: that planning no longer matters. We no longer have to spend agonising hours prioritising feature requests or researching what to build. Build everything! Ship 100 features and see what sticks.

Quite the contrary! Wasted work, architecture bloat, quality debt and lack of alignment are still costly. The bill is just less visible now. And it compounds. With coding being commoditised, aligning on what to build is the new bottleneck. Everyone can generate code. Not everyone knows what is worth creating.

This is where upfront planning gives you the most leverage. Redoing already implemented work is brutal, even with AI. Burned tokens, bloated context windows, drifting focus and compounding mess in the codebase and specification. All avoidable if you move alignment upstream, to before the work begins.

The solution is simple: make sure the agent knows what to build before it builds. Catch wrong assumptions, misread requirements, and misaligned designs while they're still a paragraph, not thousands of lines to untangle.

I've been experimenting with different approaches, but recently I've settled on a two-step flow that mirrors how I usually work with my junior engineers. For each of their tasks, we align on what needs to be done and how it's going to be implemented.

Design Overview - The What

The Design Overview is a one-pager that aligns everyone on what needs to be built. Before anyone touches code.

The agent starts by cross-referencing the feature requirements against system documentation and the existing codebase. It surfaces gaps, ambiguities, and missing context, then reaches out to the human for clarification. Nothing moves forward on assumptions.

The resulting document captures three things: the current state of the system, the desired state after the feature lands, and the design decisions and clarifications provided. It also identifies which modules and services will be affected, and points to reference implementations - places in the codebase where similar problems have already been solved.

No deep technical detail yet. The goal at this stage is to lock in scope and anchor the work to existing patterns. Not to plan the implementation.

The document closes with detailed acceptance criteria: a precise, testable definition of what done looks like for this feature.
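To make the shape of the document concrete, here is a minimal sketch of a Design Overview as a data structure. Everything in it is hypothetical: the class and field names, the invoice-export feature and the file path are made up for illustration, not a standard format. The only logic is a readiness check that reflects the rule that nothing moves forward on assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class AcceptanceCriterion:
    id: str         # hypothetical identifier, e.g. "AC-1"
    statement: str  # precise, testable definition of done


@dataclass
class DesignOverview:
    feature: str
    current_state: str   # how the system behaves today
    desired_state: str   # how it behaves once the feature lands
    decisions: list[str] = field(default_factory=list)
    affected_modules: list[str] = field(default_factory=list)
    reference_implementations: list[str] = field(default_factory=list)
    acceptance_criteria: list[AcceptanceCriterion] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)

    def blockers(self) -> list[str]:
        """Return the reasons this overview is not ready for sign-off."""
        problems = [f"unanswered question: {q}" for q in self.open_questions]
        if not self.acceptance_criteria:
            problems.append("no acceptance criteria")
        if not self.affected_modules:
            problems.append("no affected modules identified")
        return problems


# Example data, invented for illustration only.
overview = DesignOverview(
    feature="Export invoices as CSV",
    current_state="Invoices can only be viewed in the web UI.",
    desired_state="Users can download a CSV of filtered invoices.",
    decisions=["Reuse the existing report export pipeline."],
    affected_modules=["billing", "reports"],
    reference_implementations=["reports/export_orders.py"],
    acceptance_criteria=[
        AcceptanceCriterion("AC-1", "CSV contains one row per filtered invoice."),
    ],
    open_questions=["Should archived invoices be included?"],
)
print(overview.blockers())
```

An empty list from `blockers()` would be the signal that the overview can go to the engineer and the product manager for approval.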

Both the engineer and the product manager review and approve it. Engineering confirms the scope is coherent and technically grounded and that it has not missed any important system components. Product confirms it matches intent.

The review stays deliberately focused: no implementation details, no technical noise. Just scope, intent, and acceptance criteria. That's the point. Each step is designed to be reviewed by the right people, at the right level of abstraction.

Once approved, the Design Overview is used as the starting point for the next step: the Implementation Plan.

Implementation Plan - The How

The next step picks up exactly where the Design Overview ends. The agent goes deep into technicalities, preparing a detailed Implementation Plan.

Because the agent isn't starting from a vague ticket or a bare prompt, but from an approved, structured document, the plan arrives well-grounded from the first draft. It already knows which parts of the system are in scope, which design patterns to reuse, how the feature integrates with existing services, and what the solution has to achieve. The Design Overview did that work upfront.

The first thing the agent does is translate the acceptance criteria from the Design Overview into concrete automated test cases. These become the verification layer for the final implementation: the definition of done made executable.
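One way to picture this step is a small generator that turns each criterion into a failing test stub, so an unverified criterion can never pass silently. This is a sketch under assumptions: the criterion IDs, the helper names and the pytest-style `test_` naming are illustrative choices, and a real agent would write actual assertions rather than placeholders.

```python
import re


def to_test_name(criterion_id: str, statement: str) -> str:
    """Build a valid Python test function name from a criterion."""
    text = f"{criterion_id} {statement}".lower()
    slug = re.sub(r"[^a-z0-9]+", "_", text).strip("_")
    return f"test_{slug}"


def render_test_stubs(criteria: dict[str, str]) -> str:
    """Render one failing stub per acceptance criterion.

    Sketch only: statements containing quote characters would need escaping.
    """
    blocks = []
    for criterion_id, statement in criteria.items():
        blocks.append(
            f"def {to_test_name(criterion_id, statement)}():\n"
            f'    """{criterion_id}: {statement}"""\n'
            f'    raise NotImplementedError("{criterion_id} is not verified yet")\n'
        )
    return "\n\n".join(blocks)


# Hypothetical criteria, as they might appear in an approved Design Overview.
criteria = {
    "AC-1": "CSV contains one row per filtered invoice.",
    "AC-2": "Export fails with a clear error for empty filters.",
}
stubs = render_test_stubs(criteria)
print(stubs)
```

Because every stub raises until it is implemented, the test suite doubles as a checklist of which criteria the implementation has yet to satisfy.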

From that foundation, the agent produces a detailed, step-by-step implementation plan: what gets built, in what order, using which patterns, integrating with which services.

The engineer reviews the plan. Because it's concise and explicitly linked back to the Design Overview, the review is fast: discrepancies between the two documents surface immediately. Splitting planning into two steps means each review is narrower, faster, and harder to derail.

The engineer can correct the agent's direction where needed, and flag which parts of the resulting implementation will need closer code review later.

Once approved, the Implementation Plan goes directly to the coding agent as its primary context: scope, design constraints, integration points, patterns to follow, and test cases to satisfy. The agent doesn't have to infer intent. It's all there.
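As a rough illustration, the hand-off can be as simple as rendering the approved plan into a structured block of text for the agent. The class, the field names and the section layout below are assumptions made for this sketch and are not tied to any particular agent framework.

```python
from dataclasses import dataclass


@dataclass
class ImplementationPlan:
    # Hypothetical fields, mirroring the items listed above.
    scope: str
    design_constraints: list[str]
    integration_points: list[str]
    patterns: list[str]
    test_cases: list[str]
    approved: bool = False


def build_agent_context(plan: ImplementationPlan) -> str:
    """Render an approved plan as the coding agent's primary context."""
    if not plan.approved:
        raise ValueError("plan must be approved before it reaches the coding agent")
    sections = {
        "Scope": [plan.scope],
        "Design constraints": plan.design_constraints,
        "Integration points": plan.integration_points,
        "Patterns to follow": plan.patterns,
        "Test cases to satisfy": plan.test_cases,
    }
    lines: list[str] = []
    for title, items in sections.items():
        lines.append(f"## {title}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Example data, invented for illustration only.
plan = ImplementationPlan(
    scope="CSV export for filtered invoices",
    design_constraints=["No new background job types"],
    integration_points=["billing service", "report export pipeline"],
    patterns=["Follow reports/export_orders.py"],
    test_cases=["test_ac_1_csv_contains_one_row_per_filtered_invoice"],
    approved=True,
)
print(build_agent_context(plan))
```

Refusing to render an unapproved plan keeps the sign-off from being skipped by accident.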

By Failing To Prepare, You Are Preparing To Fail

The flow above works for me, but it's just an example. The specific process matters less than the principle: review planned work early, where the leverage is highest - before implementation starts.

Plans are faster and easier to review. One or two pages of structured text instead of thousands of lines of code. A good plan tells reviewers what the agent will build, what design patterns it will apply, how it integrates with the rest of the system. Discrepancies are visible and cheap to fix. The plan also carries intent and reasoning: the why behind every decision, not just the what.

The plan itself is a first-class artifact. It can be saved as documentation. Collaboratively reviewed and iterated. And, critically, fed directly to the implementation agent as context, design reference, and constraint. The plan doesn't just precede the work; it shapes and guides it. It's a starting point for the next steps in the delivery process.

Finally, it fits naturally into automated delivery pipelines. The plan becomes the ideal human-in-the-loop checkpoint: agents generate it, flag blockers, reach out for clarification, and wait for sign-off before triggering implementation. Structured, auditable and at exactly the right moment, where the leverage is the highest.
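A minimal sketch of such a checkpoint, assuming a pipeline that only triggers implementation once the gate reports approval. The class, the status names and the approver roles are hypothetical; a real pipeline would persist this state rather than keep it in memory.

```python
from enum import Enum


class PlanStatus(Enum):
    BLOCKED = "blocked"                    # the agent is waiting for an answer
    AWAITING_SIGNOFF = "awaiting_signoff"  # no open questions, approvals missing
    APPROVED = "approved"                  # implementation may start


class PlanCheckpoint:
    """Hypothetical sign-off gate between planning and implementation."""

    def __init__(self, required_approvers: set[str]) -> None:
        self.required_approvers = set(required_approvers)
        self.approvals: set[str] = set()
        self.open_blockers: list[str] = []
        self.audit_log: list[str] = []

    def flag_blocker(self, question: str) -> None:
        self.open_blockers.append(question)
        self.audit_log.append(f"agent asked: {question}")

    def resolve_blocker(self, question: str, answer: str) -> None:
        self.open_blockers.remove(question)
        self.audit_log.append(f"human answered: {question} -> {answer}")

    def approve(self, reviewer: str) -> None:
        if self.open_blockers:
            raise RuntimeError("cannot approve a plan with open blockers")
        if reviewer not in self.required_approvers:
            raise ValueError(f"{reviewer} is not a required approver")
        self.approvals.add(reviewer)
        self.audit_log.append(f"approved by: {reviewer}")

    @property
    def status(self) -> PlanStatus:
        if self.open_blockers:
            return PlanStatus.BLOCKED
        if self.approvals >= self.required_approvers:
            return PlanStatus.APPROVED
        return PlanStatus.AWAITING_SIGNOFF


gate = PlanCheckpoint({"engineer", "product"})
gate.flag_blocker("Should archived invoices be included?")
gate.resolve_blocker("Should archived invoices be included?", "No")
gate.approve("engineer")
gate.approve("product")
print(gate.status)
```

The audit log is what makes the checkpoint reviewable after the fact: every question, answer and approval is recorded in order.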

Does this make code review extinct? No. But it becomes a specialised tool, applied only when needed. Security boundaries, auth flows, performance bottlenecks, payment handling, data integrity: places where the blast radius justifies additional time investment. The plan itself tells you which surfaces need that attention.

The heavy review moves upstream to the plan. What survives at the code level is narrow and purposeful. Not line-by-line on every MR, but a targeted safety check where it actually matters.
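One possible way to apply this is to let the plan declare its sensitive surfaces as path patterns, then route only matching files to a human reviewer. The surface names, patterns and file paths below are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical surfaces flagged in an approved plan, as glob patterns.
REVIEW_SURFACES = {
    "auth": ["src/auth/*", "src/middleware/session*"],
    "payments": ["src/payments/*"],
}


def files_needing_review(
    changed_files: list[str], surfaces: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Map each changed file to the flagged surfaces it touches."""
    flagged: dict[str, list[str]] = {}
    for path in changed_files:
        reasons = [
            name
            for name, patterns in surfaces.items()
            if any(fnmatchcase(path, pattern) for pattern in patterns)
        ]
        if reasons:
            flagged[path] = reasons
    return flagged


changed = [
    "src/auth/login.py",
    "src/payments/refunds.py",
    "src/ui/invoice_table.py",
]
print(files_needing_review(changed, REVIEW_SURFACES))
```

Files that match no pattern skip line-by-line review and rely on the tests derived from the acceptance criteria.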

The Future Is Still Ahead

Planning docs are a more efficient way to steer and control agents. But they're still a habit from the old world. RFCs, Design Docs, Architecture Decision Records - all of these predate the AI era. We're borrowing the closest thing we have and making it work. It's a good start.

We need a new generation of tools. Lightweight, collaborative, designed from scratch for agentic workflows. Tools that verify an agent's direction before it builds, keep teams aligned across the entire delivery process, and propagate the kind of knowledge that used to live in a senior engineer's head. The things code review tried to do, and increasingly couldn't.

The industry is already experimenting. Spec-driven workflows, agent collaboration environments, session-linked git history. GitHub ACE, Entire, DeltaDB, Origin. The shape of what comes next is visible. The details are still being worked out.

But the direction is clear. Stop reviewing what was built. Start aligning on what will be built. Move the leverage upstream, before the first line of code is written.

Code review had its time. That time is over.

What replaces it is still being written, by some AI agent, probably.
