Saturday, January 29, 2011

Code Review Part 1: Why do it?

Software products are complex constructions, with engineers often only understanding a tiny fraction of the code-base. Programmers often come from a background of solitary hacking, and this culture follows through to many of the software engineering companies they end up working in. Code gets written and checked in without a whiff of scrutiny. With time, it grows more complex, more siloed, more error-prone and less maintainable. Developers leave and others inherit a codebase they don't understand. In the end, schedules slip, things break, customers are unhappy and developers are stressed.

If your company writes software and doesn't get it peer reviewed, it needs to start doing so now because the above problems can all be solved or mitigated with code reviews.

What Are Code Reviews?

There are many different types of code reviews – Wikipedia separates them into two categories:

  • Formal
  • Lightweight
    • Over-the-shoulder
    • Email pass-around
    • Pair programming
    • Tool-assisted code review

While these each have their place, this article will treat "code review" as meaning a lightweight process where for every commit (on which it makes sense), the coder sends it, via a code review tool, to a peer to inspect and comment on before revising it and pushing it into the repository.

Why Code Reviews?

Quality

The obvious reason for having a code review process is to improve the quality of the code. There are a number of ways in which this happens:

  • Reviewers can reject commits
    • With fewer bad commits going in, the average quality of the codebase will be better.
  • Reviewers can comment on commits
    • More important than the previous point, code review is a constructive process where code is improved before being integrated.
  • Committers can learn from comments
    • Not only is the code under review improved, but code that hasn't even been written yet can also be improved! The next time the coder writes something, they'll keep the points made in the previous review in their mind.
  • Reviewers can learn from the code
    • The previous point is, of course, a two-way street. Reading code written by other people is a great way to improve your own ability.
    • Moreover, by having every commit understood by two people – the coder and the reviewer – the codebase becomes more transferable, essentially improving the bus factor.
  • Motivation to do "boring" tasks
    • By having someone push you, unit tests, documentation and changelog entries are more likely to be written, necessary refactorings are more likely to be done and the bug tracking system is more likely to be updated.

These are just some of the reasons why code reviews improve general code quality. One final point I'll make is that it's not only the code review process itself that produces good code. Even the thought of someone reading your code is a big driving force in ensuring that it is bug-free, easy to follow, tested, well-designed, and generally not embarrassing.

Morale

Besides quality, doing code reviews makes coding more fun! Part of the satisfaction of being a programmer is in knowing that what you're doing is correct. Unlike writing books, writing music, or even mathematical proofs, writing code gives you the ability to run it through a compiler or through some unit tests to give you a binary answer as to whether it's correct. However, compilers and unit tests leave huge gaps in testing (eg. unit tests can be incomplete/wrong, APIs and human interfaces can be hard to use, etc.). To plug these gaps, we can use the tool tried and tested by authors, musicians and mathematicians: peer review.

It's relatively easy to write code that compiles and passes some half-hearted tests and to churn out drivel into the repository like a monkey, but to craft code that stands up to the scrutiny of your peers and bears their mark of approval is another experience in itself.

Speed

Code reviews improve speed of delivery. This is a seemingly paradoxical and possibly controversial claim. How, you might ask, does requiring the extra steps of code review result in a faster workflow? Well in my experience, this follows from the previous results:

Firstly, it is well accepted that defects found earlier end up costing less in the long run. Code reviews aim to catch defects earlier than almost any other quality control measure, and thus reduce the amount of time you spend fixing them later on.

Secondly, having your code approved by a peer gives you greater confidence in its correctness, and greater confidence lends to higher productivity. Even simply working closely with a peer gives you greater motivation to power through the code. In my experience, I've found that the greater visibility, recognition and accountability within the team afforded by the code review process is a great driver for inspiration.

Code Reviews Everywhere!

By now, the case for doing code reviews should be pretty clear. In fact, I'm going to make the bold claim:

For any software team not already doing so, the single most effective thing it can do to improve product quality is to introduce a code review process.

I put code reviews above (but not as a replacement for) all other quality assurance processes. If your team has no quality processes, start with code reviews and use them as a platform to later introduce automated testing. If you're already doing unit/regression/system/visual/whatever tests, you're still leaving gaping holes by not having qualitative metrics for code quality.

The reason is that peer review is the most applicable quality assurance process I know of. For instance, unit tests don't apply at the very fringes of the system, black box system testing can often be misguided by not understanding the distribution of complexity in the code, and testing UIs is notoriously tedious work. On the other hand, it's reasonable to say that any code that can be written by a human can be reviewed by a human.

So what are you waiting for? It's time to start reviewing some code! But before we do that, we need some decent tools for doing so; something I'll talk about in Part 2...