So it has happened again. The old fifteen-years-and-counting-project-that-just-won’t-die has lost its last maintainer. And now that nobody’s left who knows anything about the code base, the powers that be take notice and assign a new “team”, consisting of one person: You. Take two days to examine the code, then tell us how long it’ll take to implement this extra long list of new features we want yesterday. No problem, right?
Here’s my personal checklist for that special occasion we all know and love. If you want to use it, consider it a general guide, but don’t follow it by the letter. Every project is a little different. Use your head. And read “Working Effectively With Legacy Code”.
Before you touch anything, get everything there is stored in a git repo. If it’s from CVS or Subversion or whatever else, import into Git. Or some other system of your choice you are familiar with. Don’t experiment here.
Now that you have your shiny new repo, use it. Branch. Commit. Don’t be shy. And use good commit log messages. Bonus points for selecting the correct character encoding if some idiot used German umlauts or French accents right inside file names.
2. Make it build
In an ideal world, you can check out a piece of code and build it right out of the box. Experience tells a somewhat different story. Missing libraries, hard-coded paths to the (only) developer’s computer, whatever. Get the system to build with a single command right after checking it out. I want to be able to do
git clone whatever; make build; make test
…and get a working system. Use make, ant, msbuild… you get the idea. Add JUnit or NUnit or whatever, and at least one dummy test. Don’t worry about generating burnable ISO files with a single command yet.
3. Look out for reflection
4. Rename All The Things
Seriously. When I take a legacy system apart, I rename all namespaces/packages and all classes (careful with reflection here). If a class was called “DokGenChaRefÜbTrnX” (sadly, I’m not even kidding here), use that handy refactoring menu your IDE hopefully provides you with, and rename that sucker to “TODO_1_DokGenChaRefUebTrnX” or the like. Get rid of non-ASCII characters and make it clear that this class name will not remain for long.
Build, run and test the system. Make sure it still works. If not, curse reflection and start digging.
5. Take the classes apart
Now tackle all classes one by one. Open file. Cross-read the 7500 lines of code, taking note of the outdated comment lines, method names and so on. Try to figure out what the class is supposed to do. Write down all the things the class seems to be concerned with. Now rename the class again to reflect its main purpose. Change method names to reflect what they seem to be doing. Now our example class might be called “TODO_2_IniFile”, even though it also takes care about displaying a dialog, reading specialized configuration data for this project only and whatnot.
At this point you should have a basic understanding of the code, and can guesstimate how long it’ll take to get the project into a halfway maintainable state. However, depending on how much time you have, digging deeper will get you better estimations, so go ahead!
6. Single Responsibility Principle
Now that all our classes have halfway appropriate names, take them apart. Try to move out everything that belongs into another class. Create new classes as you go. Sometimes it is necessary to make a big “BallOfMud” class that contains all methods you’re not sure about at the moment, or that do too much. Rename our example class to “TODO_3_IniFile”. Create a new class “Configuration” for the specialized config data, and a new “[…]gui.ConfigurationDialog” for the GUI part.
It is at this stage that I start to refactor methods. Extract methods where appropriate. Try to get some basic code reuse in. Try to make methods static so they can be moved into other classes easier. Don’t be afraid of creating 15-parameter monsters. We’ll take care of them later.
If you stumble upon a method call that says
, create and use a new method like this:
3-6, in parallel: Test, test, test.
Write unit tests for everything you touch, or mark pieces of code you can’t test yet with a comment. I use the following:
// TODO (this breaks the code, MUST be fixed before going live)
// HACK (works, but could be better)
// FIXME (known bug located here, or I'm not sure how this ever worked at all)
// TEST (needs to be changed so we can test it, no tests yet)
// WTF (duh)
7. Clean up the code
Now take all classes apart again, one by one, and clean them up. Find duplicate code. Make methods smaller and simpler. Clean up irrelevant and/or misleading comments. Best case is your class, method and variable names are clear enough not to need comments at all. Comment the why, not the how. Now our example class finally has the right to be called “IniFile”.
Resist the urge to do more aggressive refactorings until this point. If you dive in too fast, it’ll bite you where it hurts.
8. The other alternative
If the system you’re tackling is just too complex to refactor and you just KNOW that you have to re-write it (hint: don’t), try to create a barrier between the legacy code and the new code you’re writing. Write every new class the way it should be used, write tests, then modify your “border layer” so that the old, ugly code can access your new code without interfering with its pristine shininess. Then just move the border further and further into the old part, replacing it with new code. Sounds too easy to be true? It is.