Friday, March 22, 2013

Unit Testing code with IO file operations (in Python)

We often come across code that was written without any Unit Tests at all. If that code also deals with IO, such as writing and reading files, it becomes harder to Unit Test when we want to refactor or modify it. Let's suppose the code in question looks like this:

def writeInitialsToFile(filename, name, surname):
    initials = name[0] + '.' + surname[0] + '.'
    with open(filename, 'w') as file:
        file.write(initials)


def readInitials(filename):
    initials = None
    with open(filename, 'r') as file:
        initials = file.readline()
    return initials


A straightforward but bad idea would be to write a couple of Unit Tests that use a real file and simply test the reading and writing. Is there a better way to test this code?
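To make the drawback concrete, here is a rough sketch of what such a real-file test might look like. The test name, the file name and the use of tempfile are assumptions made for illustration; they are not part of the original code.

```python
import os
import tempfile


def writeInitialsToFile(filename, name, surname):
    initials = name[0] + '.' + surname[0] + '.'
    with open(filename, 'w') as file:
        file.write(initials)


def readInitials(filename):
    with open(filename, 'r') as file:
        return file.readline()


def testInitialsRoundTripThroughRealFile():
    # Hypothetical test: it needs a writable directory on the machine
    # running it and has to clean up after itself.
    directory = tempfile.mkdtemp()
    filename = os.path.join(directory, 'initials.txt')
    try:
        writeInitialsToFile(filename, 'Thomas', 'Mann')
        assert readInitials(filename) == 'T.M.'
    finally:
        if os.path.exists(filename):
            os.remove(filename)
        os.rmdir(directory)
```

Note how much of the test is about managing the file rather than checking the initials.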

First of all, we need a way to replace the real file with something else. For both reading and writing we will now have a pair of functions: one that expects an already open stream and reads from or writes to it, and another that creates the stream and calls the first one:

def readInitialsFromFileStream(fileStream):
    return fileStream.readline()


def readInitialsFromFile(filename):
    initials = None
    with open(filename, 'r') as fileStream:
        initials = readInitialsFromFileStream(fileStream)
    return initials


def writeInitialsToFileStream(fileStream, name, surname):
    initials = name[0] + '.' + surname[0] + '.'
    fileStream.write(initials)


def writeInitialsToFile(filename, name, surname):
    with open(filename, 'w') as fileStream:
        writeInitialsToFileStream(fileStream, name, surname)






Now we can test at least the pair readInitialsFromFileStream and writeInitialsToFileStream. To test these functions we don't even need to create a file in the file system. We can just pass something that behaves like a file but is not one: io.StringIO, an in-memory text stream.

A quasi-Unit Test snippet (without the full unittest.TestCase class, to shorten things a bit) would look like this:

import io


def testReadingOfInitialsFromFileStream():
    # Fill an in-memory stream and rewind it, instead of creating a real file
    testStream = io.StringIO()
    testStream.write('T.M.')
    testStream.seek(0)
    assert readInitialsFromFileStream(testStream) == 'T.M.'


def testWritingOfInitialsToFileStream():
    testStream = io.StringIO()
    writeInitialsToFileStream(testStream, 'Thomas', 'Mann')
    # Rewind before reading back what the function wrote
    testStream.seek(0)
    assert testStream.readline() == 'T.M.'
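For completeness, the same two checks could be wrapped in a full unittest.TestCase roughly as follows. This is only a sketch: the class name InitialsStreamTest is made up for the example, and the two stream functions are repeated so that the snippet runs on its own.

```python
import io
import unittest


def readInitialsFromFileStream(fileStream):
    return fileStream.readline()


def writeInitialsToFileStream(fileStream, name, surname):
    initials = name[0] + '.' + surname[0] + '.'
    fileStream.write(initials)


class InitialsStreamTest(unittest.TestCase):  # hypothetical class name
    def testReadingOfInitialsFromFileStream(self):
        # A StringIO created with initial content starts at position 0
        testStream = io.StringIO('T.M.')
        self.assertEqual('T.M.', readInitialsFromFileStream(testStream))

    def testWritingOfInitialsToFileStream(self):
        testStream = io.StringIO()
        writeInitialsToFileStream(testStream, 'Thomas', 'Mann')
        # getvalue() returns everything written, whatever the stream position
        self.assertEqual('T.M.', testStream.getvalue())
```

Saved in a module, these tests can be run with python -m unittest.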



Why is this better than using a real file? Some of the reasons (probably not all) are:
  • unit tests should work flawlessly regardless of the environment; when using a real file, we may have different permissions depending on the computer and operating system where the tests are run
  • some tests (although this was not necessarily clear from the example) may require that the file has a certain name or even certain content, otherwise they'll fail; by using a real file with a pre-set name and content we create an unnecessary dependency that may impact the maintainability and, in the long run, the usefulness of the tests
  • in order to avoid using a real file we had to separate the creation of a stream from writing to / reading from the stream; this forced us to take a first step towards fulfilling the Single Responsibility Principle
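The thin wrappers that open the file are still untested at this point. One possible way to cover them without touching the file system, which is a different technique from the stream extraction described in this post, is to patch the built-in open with unittest.mock.mock_open from the standard library (Python 3.3 and later). The test name and the file name below are assumptions for the sake of the example.

```python
from unittest import mock


def readInitialsFromFileStream(fileStream):
    return fileStream.readline()


def readInitialsFromFile(filename):
    with open(filename, 'r') as fileStream:
        return readInitialsFromFileStream(fileStream)


def testReadInitialsFromFileOpensTheGivenFile():
    # mock_open builds a fake open() whose file handle yields read_data
    fakeOpen = mock.mock_open(read_data='T.M.')
    # Replace the built-in open for the duration of the call only
    with mock.patch('builtins.open', fakeOpen):
        initials = readInitialsFromFile('initials.txt')
    assert initials == 'T.M.'
    # The wrapper's only job: open the right file in the right mode
    fakeOpen.assert_called_once_with('initials.txt', 'r')
```

Because the wrapper only creates the stream and delegates, this single check on the open call is about all it needs.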
