Don’t Mock the UNIX Filesystem

When writing unit tests, it is good to call functions with “mocks” or “fakes”: objects with an equivalent interface but a simple, “fake” implementation. For example, instead of a real socket object, you can pass something that has recv() but returns “hello” the first time and an empty string the second time. This is great! Instead of testing the vagaries of the other side of a socket connection, you can focus on testing your code, and force it to handle corner cases, like recv() returning partial messages, that happen rarely on the same host (but not so rarely in more complex network environments).
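A minimal sketch of such a fake socket, in Python. The names FakeSocket and read_all are made up for illustration; read_all stands in for whatever code you actually want to test.

```python
class FakeSocket:
    """Stand-in for a socket: recv() replays canned chunks, then returns b""."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv(self, bufsize):
        # Simplification: bufsize is ignored, so keep canned chunks small.
        if self._chunks:
            return self._chunks.pop(0)
        return b""  # an empty result means the peer closed the connection


def read_all(sock):
    """Hypothetical code under test: read until the peer closes."""
    parts = []
    while True:
        data = sock.recv(4096)
        if not data:
            break
        parts.append(data)
    return b"".join(parts)
```

Passing FakeSocket([b"hel", b"lo"]) to read_all exercises the partial-message case without any real network.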

There is one OS interface which it is wise not to mock — the venerable UNIX file system. Mocking the file system is the classic case of low-ROI effort:

It is easy to isolate: if functions get a parameter of “which directory to work inside”, tests can use a per-suite temporary directory. Directories are cheap to create and destroy.
It is reliable: the file system rarely fails — and if it does, your code is likely to get weird crashes anyway.
The surface area is enormous: open(), but also os.open, os.mkdir, os.rename, os.mknod, shutil.copytree and others, plus modules calling out to C functions which call out to C’s fopen().

The first two items decrease the Return, since mocking the file system does not make the tests easier to write or the test run more reproducible, while the last one increases the Investment.
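The isolation point can be sketched roughly like this, using the standard tempfile module. The function write_report is a hypothetical piece of code under test; the only thing that matters is that it takes the directory to work inside as a parameter.

```python
import os
import tempfile


def write_report(directory, name, text):
    """Hypothetical code under test: it is told which directory to work inside."""
    path = os.path.join(directory, name)
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(text)
    os.rename(tmp_path, path)  # real rename on the real file system
    return path


def test_write_report():
    # A fresh directory per test; it is removed when the block exits.
    with tempfile.TemporaryDirectory() as workdir:
        path = write_report(workdir, "report.txt", "ok\n")
        with open(path) as f:
            assert f.read() == "ok\n"
        # No leftover temporary file should remain.
        assert os.listdir(workdir) == ["report.txt"]
    assert not os.path.exists(workdir)
```

Nothing is mocked here: open() and os.rename hit the real file system, and the test still cannot interfere with anything outside its own directory.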

Do not mock the file system, or it will mock you back.

This entry was posted on Friday, December 2nd, 2016 at 5:34 am and is filed under Uncategorized.

2 Responses to Don’t Mock the UNIX Filesystem

glyph says:

December 2, 2016 at 6:35 am

There’s one important counterpoint to this: if you have a program which intimately interfaces with the filesystem, and needs to handle very specific failures (EXDEV, for example, or ENOSPC) which can be hard to produce on a real filesystem, then it makes sense to mock the filesystem. It’s just that much code can blanket handle all failures in the same way or let them propagate as exceptions, or cares only about the happy path, so you don’t often *need* to mock the filesystem.
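To illustrate that kind of targeted mock, here is a minimal sketch that patches only os.rename to raise EXDEV and leaves the rest of the file system real. The function move_file is a hypothetical example of code that handles the error; it is an assumption for illustration, not anything from the post.

```python
import errno
import os
import shutil
import tempfile
from unittest import mock


def move_file(src, dst):
    """Hypothetical code under test: rename, falling back to copy on EXDEV."""
    try:
        os.rename(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise
        shutil.copy2(src, dst)
        os.unlink(src)


def test_move_file_across_devices():
    with tempfile.TemporaryDirectory() as workdir:
        src = os.path.join(workdir, "a")
        dst = os.path.join(workdir, "b")
        with open(src, "w") as f:
            f.write("data")
        exdev = OSError(errno.EXDEV, os.strerror(errno.EXDEV))
        # Only os.rename is replaced; everything else is the real thing.
        with mock.patch("os.rename", side_effect=exdev) as fake_rename:
            move_file(src, dst)
        fake_rename.assert_called_once_with(src, dst)
        assert not os.path.exists(src)
        with open(dst) as f:
            assert f.read() == "data"
```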

mithrandi says:

December 5, 2016 at 4:15 am

If your program calls fsync(), avoiding that will probably speed your tests up dramatically; you can alternatively jump through hoops trying to ensure your tests are run under eatmydata or on a tmpfs, but I’ve never seen that work out consistently.
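One way this might look, as a rough sketch: patch os.fsync alone with unittest.mock and keep every other file operation real. The function durable_write is a hypothetical stand-in for the program's own code.

```python
import os
import tempfile
from unittest import mock


def durable_write(path, data):
    """Hypothetical code under test: write and force the data to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def test_durable_write_without_fsync():
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "blob")
        # os.fsync becomes a no-op that records its calls.
        with mock.patch("os.fsync") as fake_fsync:
            durable_write(path, b"payload")
        assert fake_fsync.call_count == 1
        with open(path, "rb") as f:
            assert f.read() == b"payload"
```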
