Code Review That Actually Works

Small PRs, fast turnaround, rotated reviewers. The practices that make review valuable instead of painful.

According to Microsoft Research, only 15% of code review comments relate directly to bugs. Most comments address maintainability and style. Here's how to make the other 85% of your review time actually count.

TL;DR

Enforce 200-400 line PR limits. Review within 24 hours. Rotate reviewers. Automate formatting and linting. Focus human review on logic, not style.

If you've read Code Review Is Broken, you know the problems: 4+ day review delays, bottlenecked seniors, approval theater. The dysfunction is well-documented. What's less discussed are the concrete practices that make review actually work.

Having built and led engineering teams since the 90s (including at MSNBC, where we shipped code daily), I've seen what separates teams that make review work from those stuck in approval theater. The patterns are consistent and repeatable. None of this is revolutionary. It's disciplined execution of basics that most teams skip.

The 200-400 Line Rule

PR size is the single biggest predictor of review quality. LinearB research found that reviews of PRs under 400 lines catch significantly more issues than large PRs. Beyond 400 lines, reviewer attention degrades rapidly.

What this means practically:

  • Enforce limits in CI. Block PRs over 400 lines from merging. Make exceptions require explicit approval (a sketch of this check follows the list).
  • Stack PRs when needed. Large features become 3-4 small PRs instead of one massive changeset. Review them in sequence.
  • Exclude generated code. Lock files, migrations, and generated code inflate line counts without adding review burden. Configure your tools to exclude them.
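
The CI enforcement from the first bullet doesn't need a product; a small script that diffs against the base branch is enough. Here's a minimal sketch in Python, where the 400-line limit, the BASE_REF environment variable, and the exclude patterns are assumptions to adapt to your own pipeline:

```python
#!/usr/bin/env python3
"""CI gate: fail the build when a PR exceeds the line limit.

Minimal sketch. The 400-line limit, the BASE_REF variable, and the
EXCLUDE patterns are assumptions to adapt to your own pipeline.
"""
import os
import subprocess
import sys
from fnmatch import fnmatch

LIMIT = 400
EXCLUDE = ["*.lock", "package-lock.json", "migrations/*"]  # generated code

def changed_lines(base: str) -> int:
    # --numstat prints "added<TAB>deleted<TAB>path" for each file in the diff.
    out = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        added, deleted, path = line.split("\t", 2)
        if added == "-" or any(fnmatch(path, pat) for pat in EXCLUDE):
            continue  # skip binary files and excluded generated code
        total += int(added) + int(deleted)
    return total

if __name__ == "__main__":
    base = os.environ.get("BASE_REF", "origin/main")
    size = changed_lines(base)
    if size > LIMIT:
        print(f"PR changes {size} lines (limit {LIMIT}). Split it into stacked PRs.")
        sys.exit(1)
    print(f"PR size OK: {size} lines changed.")
```

Run it as an early CI step so oversized PRs fail fast, before a reviewer ever opens them; the explicit-approval escape hatch can be a label check layered on top.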

Teams push back on size limits because "breaking things up takes extra time." It does. But the total time (including review wait, context-switching, and bug-fixing) is lower with small PRs. The math works.

The 24-Hour SLA

Review delay kills productivity. According to Meta's engineering blog, the average PR sits 4+ days before review. Every day of delay costs context and momentum.

The fix: Establish a 24-hour review SLA. Not a guideline, but an expectation with visibility.

Track review turnaround time as a team metric. When reviews consistently exceed 24 hours, something is wrong: too few reviewers, PRs too large, or misaligned priorities.
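
Tracking turnaround doesn't require a dashboard product either. Below is a rough sketch against the GitHub REST API that reports time-to-first-review for recently closed PRs; the repository name, the GITHUB_TOKEN variable, and the per-page window are placeholders, not anything specific to a real setup:

```python
"""Report time-to-first-review for recently closed PRs.

Rough sketch against the GitHub REST API. REPO, the GITHUB_TOKEN
variable, and the 24-hour threshold are placeholders.
"""
import os
from datetime import datetime

import requests

API = "https://api.github.com"
REPO = "your-org/your-repo"  # hypothetical repository
SLA_HOURS = 24
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

def parse(ts: str) -> datetime:
    # GitHub returns ISO 8601 timestamps like "2024-05-01T12:34:56Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

prs = requests.get(
    f"{API}/repos/{REPO}/pulls",
    params={"state": "closed", "per_page": 30},
    headers=HEADERS,
).json()

delays = []
for pr in prs:
    reviews = requests.get(
        f"{API}/repos/{REPO}/pulls/{pr['number']}/reviews", headers=HEADERS
    ).json()
    if not reviews:
        continue  # merged or closed with no review; worth flagging separately
    first = min(parse(r["submitted_at"]) for r in reviews)
    hours = (first - parse(pr["created_at"])).total_seconds() / 3600
    delays.append(hours)
    flag = "OVER SLA" if hours > SLA_HOURS else "ok"
    print(f"#{pr['number']}: first review after {hours:.1f}h ({flag})")

if delays:
    print(f"Average over {len(delays)} PRs: {sum(delays) / len(delays):.1f}h")
```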

Rotate Reviewers Deliberately

Most teams have 1-2 people who do 80% of reviews. This creates bottlenecks and concentrates knowledge. Worse, it prevents junior developers from developing review skills. As I explored in The Anatomy of a High-Velocity Engineering Team, the best teams distribute expertise rather than concentrate it.

Rotation patterns that work:

  • Round-robin assignment. Automatically assign reviewers in rotation. Override only for specialized domains (see the sketch after this list).
  • Cross-seniority pairing. Junior developers reviewing senior code forces clearer communication and spreads knowledge.
  • Area ownership. Assign code owners by directory, but ensure owners aren't single points of failure.
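
Round-robin with owner overrides can be as simple as a deterministic pick keyed on the PR number. A minimal sketch, where the roster and the OWNERS map are hypothetical stand-ins for your own config:

```python
"""Round-robin reviewer assignment with area-owner overrides.

Minimal sketch. The TEAM roster and OWNERS map are hypothetical and
would come from your own config or a CODEOWNERS-style file.
"""

TEAM = ["alice", "bala", "carmen", "deepa", "eli"]
OWNERS = {"payments/": "carmen", "infra/": "eli"}  # specialized domains

def pick_reviewer(pr_number: int, author: str, touched_paths: list[str]) -> str:
    # Specialized areas go to their owner, unless the owner wrote the PR.
    for prefix, owner in OWNERS.items():
        if owner != author and any(p.startswith(prefix) for p in touched_paths):
            return owner
    # Otherwise rotate deterministically over everyone except the author.
    candidates = [name for name in TEAM if name != author]
    return candidates[pr_number % len(candidates)]

# Example: PR #128 by alice, touching ordinary application code.
print(pick_reviewer(128, "alice", ["app/cart.py", "app/tests/test_cart.py"]))
```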

The goal is every engineer doing meaningful reviews regularly. "I'm too busy" isn't acceptable—review is part of the job, not extra work.

One caveat: rotation works best when combined with code ownership. Having everyone review everything leads to diffuse accountability. The pattern that works is having designated owners for each area, but rotating who reviews within that ownership structure. The owner has final say, but different team members build familiarity through rotation. This prevents both the bottleneck problem and the "nobody really owns this" problem.

Automate the Obvious

Human reviewers should focus on logic and design, not formatting or style. Every minute spent on "add a newline here" is a minute not spent on "this edge case will crash in production." This is part of the layer tax—automation that should handle the mundane so humans can focus on judgment.

What to automate:

  • Formatting. Prettier, Black, gofmt: pick one and enforce it in CI. No human should ever comment on formatting.
  • Linting. ESLint, Pylint, and Clippy catch common mistakes automatically.
  • Type checking. TypeScript, mypy, etc. Let the compiler find type errors.
  • Test coverage. Block PRs that reduce coverage below threshold.
  • Security scanning. Dependabot, Snyk, etc. for known vulnerabilities.

If your CI doesn't catch it, humans won't consistently catch it either. Automate the automatable.
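
For a Python codebase, most of the list above can collapse into a single CI gate. A sketch assuming black, pylint, mypy, and pytest-cov are installed; the "src" package path and the 80% coverage threshold are assumptions:

```python
"""One CI gate for the automatable checks, so humans never comment on them.

Sketch assuming a Python project with black, pylint, mypy, and
pytest-cov installed; the "src" path and 80% threshold are assumptions.
"""
import subprocess
import sys

CHECKS = [
    ("formatting", ["black", "--check", "."]),
    ("linting", ["pylint", "src"]),
    ("types", ["mypy", "src"]),
    ("coverage", ["pytest", "--cov=src", "--cov-fail-under=80", "-q"]),
]

failed = [name for name, cmd in CHECKS if subprocess.run(cmd).returncode != 0]

if failed:
    print(f"Automated gate failed: {', '.join(failed)}")
    sys.exit(1)
print("Automated checks passed. Human review can focus on logic and design.")
```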

Write Reviews That Help

Microsoft Research on code review effectiveness found that the most useful comments identify functional issues, point out missing validation, or suggest better API usage. Style nitpicks were rated least useful.

Review comment hierarchy:

  1. Bugs and security issues. "This will crash on null input." Top priority.
  2. Logic errors. "This loop condition is off by one." High priority.
  3. Missing cases. "What happens if the user cancels mid-operation?" High priority.
  4. Design concerns. "This couples X to Y tightly. Consider an interface." Medium priority.
  5. Clarity improvements. "This function name doesn't reflect what it does." Lower priority.
  6. Style suggestions. "I'd write this differently." Only if it genuinely improves readability.

Prefix comments with severity: "BLOCKING: this will cause data loss" vs "NIT: consider a clearer name." This helps authors prioritize.

PR Descriptions Matter

Microsoft's research also found that well-written PR descriptions are "one of the biggest time-savers during reviews." Yet most PRs have minimal descriptions.

What a good PR description includes:

  • What changed and why. Not just "fixed bug" but "fixed null pointer when user has no orders."
  • How to test it. Steps to verify the change works.
  • What to focus on. "Please scrutinize the caching logic" directs reviewer attention.
  • What NOT to review. "The migration file is auto-generated, skip it."

Good descriptions reduce reviewer load. They provide context that makes review faster and more accurate.

When to Skip Review

Not everything needs review. Review is expensive. Use it where it adds value.

Skip or fast-track review for:

  • Typo fixes. One-line documentation changes don't need two reviewers.
  • Generated code. Migrations, lock files, scaffolding.
  • Reverts. If you're reverting a broken change, ship it. Review later.
  • Emergency fixes. Production is down. Ship the fix. Review afterward.

Require thorough review for:

  • Security-sensitive code. Auth, payments, data handling.
  • Core infrastructure. Database schemas, API contracts, shared libraries.
  • New patterns. First use of a new library or architecture pattern.

Calibrate review effort to risk. Not all code is equally important.

Making It Stick

The hardest part isn't knowing what to do: it's getting teams to actually do it. Here's what works for adoption:

Start with measurement. Track PR size, review time, and reviewer distribution for two weeks. Show the team the data. Numbers make problems concrete and create urgency for change.

Automate enforcement first. Don't rely on willpower. Make formatters and linters mandatory in CI before asking humans to change behavior. Remove the friction of choice.

Lead by example. Senior engineers should submit small PRs, review quickly, and write thorough descriptions. Culture flows from observed behavior, not declared policy.

Celebrate progress. When review times drop or someone catches a significant bug, acknowledge it. Positive reinforcement builds habits faster than criticism.

The teams I've seen transform their review culture did it over months, not days. They picked one practice, nailed it, then added another. Gradual improvement beats grand reorganization.

Review Health Scorecard

Score your team's review process. Be honest—the point is to find what to fix first.

Each dimension scores 0, 1, or 2:

  • PR Size. 0 = most PRs over 500 lines; 1 = most PRs 200-500 lines; 2 = most PRs under 200 lines.
  • Review Turnaround. 0 = average over 48 hours; 1 = average 24-48 hours; 2 = average under 24 hours.
  • Reviewer Distribution. 0 = 1-2 people do 80%+ of reviews; 1 = top 3 do 60% of reviews; 2 = reviews distributed across the team.
  • Automation. 0 = manual formatting checks; 1 = some linting/formatting in CI; 2 = formatting, linting, types, and coverage in CI.
  • Comment Quality. 0 = mostly style nitpicks; 1 = mix of bugs and style; 2 = focus on bugs, logic, and design.
  • PR Descriptions. 0 = "Fixed bug" or empty; 1 = what changed but no context; 2 = what, why, and how to test.

The Bottom Line

Code review works when it's fast, focused, and shared. Small PRs, quick turnaround, rotated reviewers, automated basics. None of this is complicated. It's just discipline.

The teams that make review work treat it as a first-class engineering practice, not an afterthought. They measure it, optimize it, and hold each other accountable.

If your reviews take days and catch mostly style issues, you're paying the cost without getting the benefit. Fix the process or acknowledge that review is theater.

"Code review works when it's fast, focused, and shared. Small PRs, quick turnaround, rotated reviewers, automated basics."
