Unit Testing and Refactoring

The difference between quality software and amateurish garbage is not primarily in the quality of the code, we can all write bad code on a bad day. In my opinion the difference is in the thoroughness of the unit and integration testing that surrounds the code. If the tests are good, then the code can be refactored after it is written to bring it into a better, more manageable shape.

Your Worst Nightmare...

...happened to me:

Our team was handed a project which was in an awful state. The codebase was almost 500,000 lines of code and not a single test. The code was badly written, confusing, and extensive. We were tasked with not only maintaining this code, but also keeping on top of a demanding and growing list of new feature requests from the client.

After considering our options, up to and including resignation, we decided to treat the situation as a challenge. Luckily our engineering manager at the time was a great proponent of code quality and testing, and we agreed a mandate that no new code was to be written, and no existing code was to be changed, without tests.

All well and good, but the realities soon hit home: this code didn't want to be tested. How did this happen?

Code without tests is not easy to change. That means that developers are afraid of the code, and make the minimum changes necessary to get the new feature or fix in place. This in turn means that methods grow in place, code is duplicated by copy and paste, calls to external resources (file and database access) are written in-line, and what started out as simple methods become deeply nested monstrosities, sometimes four and five levels deep and hundreds of lines long.

I say methods grow in size, why is that? Simply because the proliferation of temporary variables makes it difficult to extract a block of code into a separate method and guarantee to find all of the relevant temporaries.

So our first task was to get a test framework set up. We opted to do it properly and use the Perl Test::Class package. We then identified three major impediments to making any given piece of code testable:

  1. Some of the code was procedural - no classes, just libraries.
  2. The code was opening and using external resources (file handles, file system tests, time, databases, large production-only subsystems) in-line.
  3. Where the code was object-oriented, it was calling new() directly, making objects impossible to mock out.

The first of these problems (no o-o) was fairly easily, if only partially, solved by writing o-o wrappers around these procedural libraries.

The other two problems were addressed by creating what is generally called a global Factory that would allocate objects of the requested type, and provided a level of indirection between the code under test and the creation of new objects, open file handles etc.

In order to make these global factories convenient to use we decided that they should export a single function (not method) whose exact name could be decided by the importing package. That function would act as a shorthand to the factory's new() class method, so a typical use of the factory might be:

Before
package MyUntestablePackage;
use SomeOtherPackage;

    ...
    if (-e $datafile) {
        open DATAFILE, $datafile;
        my $blah = new SomeOtherPackage($datafile, time());
        ...
After
package MyUntestablePackage;
use Factory qw(fetch);

    ...
    if (fetch->file->exists($datafile)) {
        my $DATAFILE = fetch->fileHandle($datafile);
        my $blah = fetch->someOtherPackage($datafile, fetch->time());
        ...

The idea of exporting fetch() was If you want people to do the right thing, make it the easy thing. Typing fetch is only one more character than typing new and therefore met little resistance within the team.

So now we have the Factory, what can we do with it?

Well, the factory is a special kind of Singleton, with a special mockMode() method:

package Factory; # we didn't really call it this
use strict;
use warnings;

# special code to export fetch() ...
# ...

{
    my $self = 0; # lexical scope

    sub _new {
        my ($class) = @_;
        $self ||= bless {}, $class;
    }

    sub mockMode {
        require Mock::Factory;
        $self = Mock::Factory->new();
    }
}

sub fetch {
    __PACKAGE__->_new();
}

sub fileHandle {
    my ($self, @args) = @_;
    return FileHandle->new(@args);
}

# etc. for other factory methods

So mockMode() unconditionally assigns a mock version of the factory to the lexical $self. This can be used in the test fixtures:

sub setup : Test(setup) {
    my ($test) = @_;
    fetch->mockMode();
    my $fh = Test::MockObject->new();
    # add functionality to $fh
    # ...
    # remember the file names:
    $test->{openfiles} = [];
    fetch->mock(
        fileHandle => sub {
            my ($fetch, $filename) = @_;
            push @{$test->{openfiles}}, $filename;
            return $fh;
        }
    );
    # etc.
}

Of course the mock factory has features similar to Test::MockObject that allow factory methods to be mocked.