Testing advice on the internet is often fraught with strong opinions on the “right way to test”. I want to take a different approach: instead of trying to convince you that my approach is the right one, I’ll explain why I test the way I do.
Discussions about testing tend to quickly devolve into arguments about unit vs. integration testing or whether or not you should hit the database in your tests. I don’t find these kinds of discussions particularly fruitful because they lead to people thinking they have to “pick a side”.
I think the more useful question to ask is why we test the way we do. We didn’t randomly start writing tests a certain way. So why do we think that unit tests are the one true way and integration tests are evil (or vice-versa)? The answers to these questions tell us something about what we value in a test suite and what our priorities are.
So in this post I want to talk about which properties I value in my test suite and how that informs the way I write them.
Confidence to refactor
A test suite is not a guarantee that your code works. If your test suite is green, that doesn’t mean that your code doesn’t have bugs. All it does is tell you that the things you thought of testing still work. Probably.
Say you make a change to some part of your application and the test suite stays green. How confident are you that the system as a whole still works? Because the only thing you know at that moment is that you didn’t change the behavior of the code from the point of view of your existing tests.
So now the question becomes how well your tests reflect your code’s actual production environment. Or does a test only still pass because it depends on the idealized situation you’ve constructed in your test case, for example due to excessive mocking?
A test suite gives you a certain percentage of confidence that your application works the way it is supposed to.
How does this influence the way I write tests? I want to write tests in such a way as to maximize this percentage of confidence. For me that means that I write mostly integration tests.
Integration tests are larger tests that often test multiple components at once. They tend to make fewer assumptions about collaborators and the environment; whatever needs to happen actually happens. This in turn gives me greater confidence because the tests make fewer assumptions and more accurately reflect the environment in which the code actually runs in production.
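As a minimal sketch of what I mean by “whatever needs to happen actually happens”: the test below exercises a hypothetical signup flow against a real (in-memory) SQLite database instead of a mocked repository. The function names and schema are illustrative, not taken from any real codebase.

```python
import sqlite3


# Hypothetical signup flow: create_user/find_user and the schema are
# illustrative stand-ins, not from the post.
def create_user(conn, email):
    conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
    conn.commit()


def find_user(conn, email):
    row = conn.execute(
        "SELECT email FROM users WHERE email = ?", (email,)
    ).fetchone()
    return row[0] if row else None


def test_created_user_can_be_found():
    # A real (in-memory) database instead of a stubbed repository: the SQL
    # actually runs, so a schema mistake fails the test instead of hiding
    # behind a canned return value.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT NOT NULL)")

    create_user(conn, "ada@example.com")

    assert find_user(conn, "ada@example.com") == "ada@example.com"
```

Because the query really executes, the test keeps passing only if the code and the database still agree with each other.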
The more confident I am in the tests, the more confidently I can refactor.
Tests as documentation
I have found that a great way of familiarizing myself with a complex codebase is to read through its test suite. One often underappreciated property of a good test suite is that it serves as executable documentation. “Executable” being the keyword here.
I bet you have experienced the following scenario at least once in your career.
```php
// Add 1 to foo
$foo += 2;
```
Except that it was probably much less obvious that the comment was incorrect. These kinds of comments were usually correct at some point; they simply weren’t updated when the surrounding code changed.
I don’t know how many hours of my life I have lost because I assumed a comment was true, only to find out later that it wasn’t. But tests are executable. I can check whether what they’re claiming is still correct.
How does this influence the way I write tests? If I treat tests as first-class documentation, I want them to be as easy to read and understand as possible.
One way to achieve this is to really think about what details are actually relevant to a test. I have talked about this in more detail in another post so I encourage you to hie thee hence over there and check it out.
This is also another reason why I dislike heavy mocking. Not only does it reduce my confidence in the test, it also makes the test much more difficult to read. Now I have to understand how exactly this mock is supposed to be called and what it returns, which distracts me from the thing that is actually being tested.
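To illustrate that reading cost, here is a deliberately mock-heavy version of a small check, using Python’s `unittest.mock`. The function `total_price` and the item names are hypothetical.

```python
from unittest import mock


# Hypothetical function under test: sums prices fetched from a repository.
def total_price(repo, item_ids):
    return sum(repo.price_of(item_id) for item_id in item_ids)


def test_total_price_with_mocks():
    repo = mock.Mock()
    # Before the reader learns what behavior is being tested, they have to
    # decode how the mock is wired: side_effect implies a specific call order.
    repo.price_of.side_effect = [5, 7]

    assert total_price(repo, ["apple", "bread"]) == 12

    # And then verify the wiring itself was honored.
    repo.price_of.assert_has_calls([mock.call("apple"), mock.call("bread")])
```

The mock setup and the call-order bookkeeping take up as much space as the actual behavior under test, which is exactly the distraction described above.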
Treating tests as documentation also means that they should be self-contained. I typically don’t read a test suite from top to bottom. Each test should be able to provide enough context on its own so it can be understood without having to jump between 10 different helper functions and files.
This is why I believe that tests should not be DRY. By removing duplication you introduce abstraction that is misplaced inside your test suite. It’s ok to repeat yourself in your test cases if it makes the tests more self-contained and easier to read.
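As a sketch of what self-contained, non-DRY tests can look like, the two tests below repeat their setup inline instead of sharing a fixture or helper. `order_total` and the order shape are hypothetical.

```python
# Hypothetical function under test: sums an order's line items.
def order_total(order):
    return sum(item["price"] * item["qty"] for item in order["items"])


def test_total_sums_line_items():
    # The order is built inline instead of in a shared make_order() helper,
    # so the test reads top to bottom without jumping to another file.
    order = {"items": [{"price": 5.0, "qty": 2}, {"price": 3.0, "qty": 1}]}
    assert order_total(order) == 13.0


def test_empty_order_totals_zero():
    # The same setup shape is repeated on purpose; the duplication keeps
    # each test understandable on its own.
    order = {"items": []}
    assert order_total(order) == 0.0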
Driving the design
I also use tests as a proxy for gauging how well-written my code is. Because to me good code is testable code. I view them as essentially the same thing. So this point is less about how I write tests but how tests influence the way I write (application) code.
In testing circles, there is this concept of listening to your tests. Tests that are difficult to read or write are very often indicative of problems with the code they’re trying to test.
How does this influence the way I write tests? In this case, my approach is to ask the question “How would this code have to look so that this test would be easy to write?”, and then follow that thread. This is an incredibly powerful idea that has led to many a design breakthrough in my code.
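A small hypothetical before/after shows the kind of change that question tends to produce. Here the awkward test pushes the clock out of the function and into a parameter; both functions and the greeting rules are invented for illustration.

```python
import datetime


def greeting_hardwired():
    # Hard to test: the collaborator (the system clock) is reached for
    # internally, so a test would need patching machinery to control it.
    hour = datetime.datetime.now().hour
    return "Good morning" if hour < 12 else "Good afternoon"


def greeting(now):
    # Easy to test: the collaborator is an explicit parameter, so no
    # mocking is needed at all.
    return "Good morning" if now.hour < 12 else "Good afternoon"


def test_greeting():
    assert greeting(datetime.datetime(2024, 1, 1, 9, 0)) == "Good morning"
    assert greeting(datetime.datetime(2024, 1, 1, 15, 0)) == "Good afternoon"
```

The test became trivial to write, and as a side effect the production code grew an explicit, honest dependency instead of a hidden one.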
What about “fast”?
I guess I should mention the elephant in the room: speed of execution. Why isn’t this on my list? Do I not care about writing fast tests? Well, kind of.
It’s not so much that I don’t care, but that I don’t prioritize having fast tests as highly as the other properties that I’ve talked about so far. If I can make the same test fast without sacrificing any of these properties, then sure, I’ll do it. But if it causes the test to be harder to read or reduces the amount of confidence it gives me—by relying on too much mocking, for example—then that’s a trade-off I am not willing to make.
The argument I often hear against writing mostly integration tests is that your test suite becomes slow. That part is hard to argue with; it’s just a fact. What I take issue with is the conclusion that is often drawn from it: that if your tests are too slow, people will run them less often.
This sounds like it makes sense on paper, but in my experience this turns out to be a complete non-argument in practice. Because it assumes that every time you run your tests, you run the entire test suite. This is not how I work at all. When I work on a piece of code I usually run only a subset of tests. And then from time to time as I get up to take a stretch I run the whole thing.
You might argue that you might miss when you’ve broken something in a different part of the app by doing it this way. But again, I have never had that be an actual problem in practice.
I am aware that I am making a trade-off here. I’m sacrificing speed of execution in favor of these other properties mentioned above. And that sacrifice is 100% worth it to me.
This post is not intended to convince you of my way of testing, but rather provide my reasoning for the decisions I make. You might have a different set of priorities or come to different conclusions. This is not a bug, it’s a feature.
This list is not about what is right or wrong, it is about priorities. The properties I listed are the things I value the most in my tests. What do you value in a good test suite? What properties do you tend to optimize for?
I strongly believe that framing the conversation about testing strategies like this leads to more fruitful discussions.