Why Technical Interviews Test the Wrong Thing

LeetCode measures puzzle-solving speed. Production engineering requires judgment.


The problem is clear: according to Interviewing.io's analysis, LeetCode scores show a correlation of only 0.27 with actual job performance - a weak relationship by any standard. The best engineer I ever hired couldn't invert a binary tree on a whiteboard. But he could debug production at 3am, design systems that scaled, and ship features users loved. We've optimized interviews for the wrong signal.

TL;DR

Replace algorithm puzzles with real work samples. Use pair programming on actual problems. Test collaboration, not memorization.

Interview Alternatives That Actually Work covers the evidence-based hiring methods that predict success.

The logic behind algorithmic interviews is sound on paper.

I've hired engineers for 30 years. I've watched the industry converge on an interview process optimized for the wrong signals. Here's what's broken and what we could do instead.

What LeetCode Actually Measures

Let's be precise about what algorithmic interviews test:

Pattern recognition under time pressure. Can you recognize that this is a dynamic programming problem and apply the pattern you memorized? Can you do it in 45 minutes while someone watches?
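
To make "the pattern you memorized" concrete, here's a sketch of a classic example - the "climbing stairs" problem, which falls instantly once you recognize the Fibonacci-style recurrence (the code is illustrative, not taken from any particular interview):

```python
def climb_stairs(n: int) -> int:
    """Count the distinct ways to climb n steps taking 1 or 2 steps at a time."""
    if n <= 2:
        return n
    prev, curr = 1, 2  # ways to reach step 1 and step 2
    for _ in range(3, n + 1):
        prev, curr = curr, prev + curr  # each step is reachable from the two below it
    return curr

print(climb_stairs(5))  # 8
```

Spotting the recurrence is the whole exercise; once the pattern is memorized, the typing is mechanical.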

Preparation investment. Did you spend 200 hours grinding LeetCode? This correlates with wanting the job, not with job performance.

Performance under artificial stress. Can you think clearly while being evaluated in an unnatural environment? Some people can. Many excellent engineers can't.

Specific knowledge recall. Do you remember the optimal algorithm for this specific problem? Knowledge that's instantly available via Google in actual work.

These are skills. They're just not the skills that make someone effective in a software engineering role.

What Production Engineering Requires

What actually matters when building software professionally:

Judgment about tradeoffs. Should we optimize for speed or maintainability? Build or buy? Microservices or monolith? Perfect or shipped? These decisions shape outcomes more than algorithmic efficiency.

Communication. Can you explain technical concepts to non-technical stakeholders? Can you write documentation that others understand? Can you disagree constructively in code review?

Debugging complex systems. Production issues involve multiple interacting components, incomplete information, and time pressure. The skill is methodical investigation, not algorithmic cleverness.

Learning new domains. The codebase you'll work on uses technologies you haven't seen. How quickly can you get productive in an unfamiliar environment?

Collaboration. Software is a team sport. Can you work effectively with others? Give and receive feedback? Support colleagues who are struggling?

Sustained productivity. Not heroic sprints but consistent output over months and years. Managing your own energy, avoiding burnout, maintaining quality when you're tired.

Knowing when not to code. Sometimes the right answer is "don't build this." Recognizing when a problem doesn't need a software solution - or when the existing solution is good enough.

None of these appear in algorithmic interviews.

The Correlation Problem

The defense of LeetCode interviews is usually "it correlates with job performance." Let's examine that claim:

Survivorship bias. Companies that use LeetCode interviews hire people who pass LeetCode interviews. They have no data on the people they rejected. The correlation is between "hired" and "succeeded" among people who passed a filter, not between "LeetCode skill" and "engineering ability." And the filter itself is noisy: research from NC State and Microsoft found that candidates' performance drops by more than half simply from being watched during a whiteboard interview.

Self-fulfilling prophecy. If you hire people who are good at algorithms, and you value algorithmic elegance in code review, and you promote people who optimize algorithms - yes, algorithmic skill will correlate with success at your company. You've built a monoculture.

Base rates matter. Software engineering roles attract generally capable people. If you hired randomly from your applicant pool, you'd probably get decent engineers. The interview's job is to improve on random selection, and the improvement is smaller than companies believe.

What gets measured gets managed. When LeetCode performance determines hiring, candidates optimize for LeetCode. When algorithm knowledge is the filter, you hire algorithm specialists. This doesn't mean algorithms predict job performance - it means you've selected for them.

The Real Reason Companies Use LeetCode

If algorithmic interviews are poor predictors, why do companies use them?

Legal defensibility. A standardized test with consistent scoring is easier to defend against discrimination claims than subjective judgment. "We hired based on objective performance" is a legal strategy, not an engineering strategy.

Scale. When you're interviewing thousands of candidates, you need a process that's consistent and easy to administer. LeetCode scales. Good judgment doesn't.

Cargo culting. Google does it. Facebook does it. Therefore it must be right. Companies copy interview processes from successful companies without asking whether the process caused the success. It's the same pattern you see with Agile methodologies - copying rituals without understanding principles.

Risk aversion. Nobody gets fired for running a standard interview process. Trying something different and having it fail is career risk. Doing what everyone else does provides cover.

Filtering for dedication. LeetCode grinding takes time. Requiring it filters for candidates who really want this specific job. That's a signal, but it's not the same as "good engineer."

What Interviews Should Test Instead

Better approaches exist. They're harder to standardize, which is why companies avoid them:

Work sample tests. Give candidates a realistic task similar to actual work. Review a pull request. Debug a failing test. Add a feature to a small codebase. Evaluate the work product, not the performance under surveillance. According to Schmidt and Hunter's meta-analysis, work sample tests are the single most predictive selection method in the hiring process.

Take-home projects. Controversial because they take candidate time, but they show what someone produces when they're not being watched. The code someone writes at home is closer to the code they'll write at work than whiteboard code.

System design with tradeoffs. Not "design Twitter" but "here's a specific problem, here are the constraints, here are three possible approaches - walk me through how you'd decide." Look for judgment, not memorized architecture patterns.

Debugging exercises. Give candidates a broken system and watch them investigate. Do they form hypotheses? Test systematically? Know when to ask for help? This is core engineering work.
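
As a sketch of what such an exercise might look like (the snippet and its planted bug are hypothetical, not from a real interview kit), you could hand a candidate code like this and ask why events from unrelated calls keep bleeding together:

```python
def record_event(event, events=[]):  # planted bug: the default list is shared across calls
    """Append an event to the given list, creating a new one if none is supplied."""
    events.append(event)
    return events

# The symptom the candidate is asked to investigate:
print(record_event("login"))   # ['login']
print(record_event("logout"))  # ['login', 'logout'] - unexpected carryover
```

What you're evaluating is the investigation - reproducing the symptom, forming a hypothesis about shared state, confirming it - not whether they happen to know this particular gotcha already.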

Code review. Show candidates code with problems - bugs, style issues, performance problems, missing tests. How do they analyze it? How do they communicate feedback? This tests daily skills.
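
A hypothetical prompt (illustrative, with the problems planted deliberately) can be as small as this, with the ask being written review comments rather than a live fix:

```python
def find_duplicate_emails(users):
    """Return emails that appear more than once in the user list."""
    duplicates = []
    for i, user in enumerate(users):
        for other in users[i + 1:]:  # quadratic scan: fine for 100 users, painful for 100,000
            if user["email"].lower() == other["email"].lower():
                if user["email"] not in duplicates:  # another linear scan inside the loop
                    duplicates.append(user["email"])
    return duplicates  # also: crashes on a missing "email" key, and ships with no tests
```

Strong candidates explain why the nested scan matters, suggest grouping by lowercased email in a dict or set, flag the missing-key case, ask about tests - and say it in a way a teammate would actually want to read.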

Past work discussion. Deep conversation about systems they've built. What decisions did they make? What would they do differently? What did they learn? Look for reflection and growth, not just accomplishment.

How to Hire for Judgment

The hardest thing to evaluate is judgment - the ability to make good decisions in ambiguous situations. Some approaches:

Scenario questions with no right answer. "Your team wants to rewrite this system from scratch. Half think it's essential, half think it's a waste. How do you decide?" There's no correct response - you're looking for how they think through uncertainty.

Disagreement questions. "Tell me about a time you disagreed with your team's technical direction. What happened?" Good engineers can disagree productively. Great engineers can change their minds when they're wrong. Leaders who can't handle disagreement often have ego problems that kill startups.

Failure questions. "Tell me about a technical decision you regret." Self-awareness about mistakes predicts learning and growth. Beware candidates who've never been wrong.

Tradeoff questions. "We could build this quickly with technical debt, or slowly with clean architecture. How would you think about that decision?" Look for nuance, not ideology.

When Algorithmic Interviews Make Sense

I'm not saying LeetCode is always wrong. It makes sense when:

  • The role involves actual algorithms. If you're building search engines, compilers, or ML infrastructure, algorithmic thinking is the job. Test what you need.
  • You're hiring at massive scale. When you're processing 100,000 applicants, standardization has real value. The false negatives hurt less than the operational chaos of bespoke evaluation.
  • The candidate pool is homogeneous. New grads from CS programs have similar backgrounds. Algorithmic tests compare apples to apples in ways that work samples can't.

But for most engineering roles - especially senior ones where judgment matters more than puzzle-solving - the process tests the wrong things.

The False Positive Problem

Companies worry about false negatives - rejecting good candidates. They should worry more about false positives - hiring people who interview well but perform poorly.

LeetCode optimizes against false negatives at the cost of false positives. It rarely rejects someone who memorized enough patterns. But it tells you nothing about whether they can:

  • Work effectively on a team
  • Handle ambiguity and changing requirements
  • Communicate clearly with stakeholders
  • Stay productive over the long term
  • Mentor others and contribute to culture
  • Make good decisions under uncertainty

These are the things that actually determine whether a hire succeeds. They're also the things LeetCode doesn't measure.

What I Look For

After 30 years of hiring, here's what I actually evaluate:

Curiosity. Do they ask good questions? Are they interested in understanding the problem deeply? Curiosity predicts learning and growth.

Clarity of thought. Can they explain something complex simply? Do they structure their thinking? Can they be precise about what they know and don't know?

Self-awareness. Do they know their strengths and weaknesses? Can they talk about failures without defensiveness? Do they seek feedback?

Collaboration signals. How do they respond to pushback? Do they listen before defending? Can they build on others' ideas?

Evidence of impact. Not "I built X" but "I built X and here's what happened." Can they connect their work to outcomes?

Growth trajectory. Where were they two years ago? What have they learned? Are they getting better?

These are harder to evaluate than LeetCode performance. They're also more predictive of success.

Interview Process Quality Scorecard

Score your hiring process. High scores indicate you're testing what actually matters.

Dimension | Score 0 (Broken) | Score 1 (Mixed) | Score 2 (Effective)
Primary Signal | Algorithm puzzles | Mix of puzzles + work samples | Work samples + past work discussion
Judgment Testing | None - only right/wrong answers | Some open-ended questions | Tradeoff scenarios, ambiguous problems
Collaboration Signal | Solo whiteboard only | Pair programming sometimes | Code review + collaborative debugging
Real Work Simulation | Never touch actual code | Take-home available | On-the-job trial or realistic project
Failure Discussion | Not asked | Asked but not weighted | Valued - self-awareness matters
Interviewer Training | None - anyone can interview | Some calibration | Structured training, bias awareness
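
If you want to total the scorecard without the interactive widget, here's a minimal sketch of the arithmetic - straight summation of the six dimensions, with verdict thresholds that are my own illustration rather than part of the original tool:

```python
# Rate each dimension 0 (broken), 1 (mixed), or 2 (effective).
RATINGS = {
    "Primary Signal": 1,
    "Judgment Testing": 0,
    "Collaboration Signal": 1,
    "Real Work Simulation": 2,
    "Failure Discussion": 0,
    "Interviewer Training": 0,
}

def score_process(ratings: dict) -> str:
    """Sum the per-dimension ratings (0-12) and attach a rough verdict."""
    total = sum(ratings.values())
    if total >= 9:
        verdict = "testing what actually matters"
    elif total >= 5:
        verdict = "mixed signals"
    else:
        verdict = "optimized for the wrong things"
    return f"{total}/12 - {verdict}"

print(score_process(RATINGS))  # "4/12 - optimized for the wrong things"
```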

The Bottom Line

Technical interviews test what's easy to test, not what matters. LeetCode measures puzzle-solving speed. Production engineering requires judgment, communication, collaboration, and sustained productivity.

The best engineers I've worked with would fail many FAANG interviews. The worst engineers I've worked with often passed them easily.

If you're hiring, question the process you inherited. If you're interviewing, recognize that failure doesn't mean you can't engineer. The test is broken, not you.

Think I'm Wrong?

Contrarian takes invite contrarian responses. If you think I'm missing something important, tell me.
