Oh no! Not another article about testing! You can find a lot of publications about automated testing in software development, and there is a good reason for that: testing your code is one of the most important aspects of software development!
I’m not going to write yet another article about how you should write your tests. I’m also not covering the details of various types of tests. I will elaborate on some pitfalls you should be aware of when writing tests and how to solve them. This is not a scientific approach; it is just my opinion. What’s covered in this article? Let’s start with discussing why testing is so important in software development. Next, I’ll elaborate on some risks and how we can avoid them. I will also dive into sociable testing and test data generation.
Will you learn something new from reading this? Maybe, maybe not! But I have one goal in mind while writing this text. I want you to think about how you tackle your testing, so you can grow as a developer!
Over the past few years, I have been teaching courses, coaching colleagues and giving presentations. My focus? Quality! Quality is not only about testing, but testing is a very important prerequisite for delivering quality software. People who attended my presentations mostly said: “I knew what you were saying, but I didn’t know that I was doing it wrong!”. So, I hope that you’ll benefit from my opinion by thinking about how you can improve your own way of working.
Importance of testing
As I mentioned in the introduction, testing is a prerequisite for delivering quality. While every developer knows this, I still see testing being treated very lightly in most programming courses. They barely touch on it: they introduce testing libraries like JUnit, say that it is helpful to write tests, give some examples and then jump straight to the next topic. There’s nothing more to say about testing, right?
When I mentor interns or when I’m coaching a junior developer, I often see them struggling with testing. They often know the basics about how they should be writing their tests, but they don’t know when they should write a test or even which test they should be writing.
Why is this a problem? For me that’s pretty obvious. My tests are the only way of proving to myself, my team and my customer that the code I’ve been writing does exactly what I want it to do! It also guarantees that the functionality that I have been writing before still works after adding more features or refactoring the code. That is why we need tests!
Testing pitfalls
However much I prefer having an extensive test suite over having no tests at all, not all tests are equally good. There are some things you should be aware of while writing tests, to ensure that you are writing good ones.
Test first
Let’s start easy! You should always try to write your test before you write your implementation. Otherwise, you risk that you’re testing the wrong things. We’re all human, so we tend to make mistakes.
Let’s imagine that we are implementing a complex calculation. We start out by writing our first test, which should obviously fail. When we begin implementing the code, we enjoy the process of implementing the calculation so much that we end up writing too much code. At least, more than we should, because we should only write the code we need to make the test pass. Now here’s the problem! We make a mistake in our calculation and, without knowing it, introduce a new bug!
Afterwards, we start adding more tests to cover the entire calculation, including every single edge case. If you’re looking at your code to write your tests, you risk writing tests that assert the bug is there, as if that were the intended behavior.
It can get even worse! We’re developers, so we are lazy! We’re also modern developers, so we’re using an AI assistant to speed us up. Almost every “self-respecting” AI assistant has a feature to generate unit tests based on a given code snippet, so we ask the AI to write tests for our entire calculation. Why should we write our tests if the AI can do this for us?
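To make the test-first idea concrete, here is a minimal sketch. The `PriceCalculator` and its discount rule are hypothetical, and plain assertions are used instead of a test framework to keep it self-contained; the point is that the expectations are written down first, and only the code needed to satisfy them gets implemented.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class PriceCalculator {
    // Only the code needed to make the expectations below pass.
    BigDecimal applyDiscount(BigDecimal price, int percent) {
        return price.multiply(BigDecimal.valueOf(100 - percent))
                .divide(BigDecimal.valueOf(100), 2, RoundingMode.HALF_UP);
    }
}

public class TestFirstSketch {
    public static void main(String[] args) {
        PriceCalculator calc = new PriceCalculator();
        // These expectations were written before the implementation,
        // derived from the requirements, not from the code.
        check(calc.applyDiscount(new BigDecimal("100.00"), 20), "80.00");
        check(calc.applyDiscount(new BigDecimal("19.99"), 0), "19.99");
        System.out.println("all expectations met");
    }

    static void check(BigDecimal actual, String expected) {
        if (!actual.equals(new BigDecimal(expected))) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}
```

Because the expectations come from the requirements, a bug slipped into the implementation fails the suite instead of being enshrined by it.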
Testing the right thing!
Another problem is testing the wrong thing. I had a colleague in the past who wasn’t familiar with writing tests. While he was writing a relatively simple feature to keep the client-side clock in sync with the server clock, he also had to test it.
He had a simple endpoint retrieving the server time, using LocalDateTime and the Clock class. He searched the internet for a way to test this, and he ended up just copy-pasting an example test that he found online. The test worked just fine, but he wasn’t testing his own code; he was testing that the Java implementation of the fixed clock worked!
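The trick for testing such a feature is to inject the Clock into your own code, so the test can pin time to a known instant and assert on your behavior rather than on Java’s. A minimal sketch with a hypothetical `ServerTimeService`:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// The service takes Clock as a constructor dependency instead of
// calling LocalDateTime.now() directly, making time controllable.
class ServerTimeService {
    private final Clock clock;

    ServerTimeService(Clock clock) {
        this.clock = clock;
    }

    long currentEpochMillis() {
        return clock.millis();
    }
}

public class FixedClockSketch {
    public static void main(String[] args) {
        Instant knownInstant = Instant.parse("2024-01-01T12:00:00Z");
        // The fixed clock is just a tool; the assertion targets our service.
        ServerTimeService service =
                new ServerTimeService(Clock.fixed(knownInstant, ZoneOffset.UTC));
        if (service.currentEpochMillis() != knownInstant.toEpochMilli()) {
            throw new AssertionError("service did not report the server time");
        }
        System.out.println("service reports the expected server time");
    }
}
```

Here the fixed clock is only test infrastructure; what is being verified is that *our* endpoint logic reports the time the clock provides.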
There is one more thing I would like to mention. This last issue is caused by how we write our tests. Everyone says that a unit test must run in isolation, making it fast and easy to write! This leads to a terrible disease called Mockitis. Mockitis is easily diagnosed: when you’re writing a test where you set up doubles for everything, you’re mocking too much. Mockitis leads to fragile tests, making it hard for us to refactor our code.
We all know that refactoring should be about improving the structure of our code without changing its behavior. In an ideal setup, we should be able to refactor our code without breaking our tests! Well, welcome to the real world! While libraries like Mockito can be particularly useful, they can also slow us down when not used with caution. We need mocks at some points in our tests, but be aware of the risk that your tests rely on your specific implementation.
So, if your tests break because you’re refactoring, know that your tests were testing the technical implementation instead of the real behavior of the code!
The solution: focus shift
I have read in many publications that test driven development is the solution for most problems with testing. I think this can be a very good place to start, but if you want to skyrocket your quality, I think it’s not quite good enough. I’m not preaching that we should all abandon test driven development, not at all! I think we should be adding another layer on top of it!
Test driven development helps us to think about our code before we are writing it, but it does not specify how to write tests and which tests to write. I have seen many introductions to TDD in the past, where the focus is only on writing unit tests. The examples are often too easy, so easy that you cannot do it wrong. There is no complex problem for you to solve, and you don’t really need to think about the architecture. No, you just have to write the test before you write the code.
I am not saying TDD is bad, because I know it is not. It is particularly important to think about what you should be implementing before you start implementing it. The easiest way to do this is to write a test first.
Then what am I promoting? I think we should start focusing more on testing the behavior of our code instead of the actual implementation. We want to know that our code produces the results we want it to. I do not really care about how the code gets to this result, because this is highly likely to change over time.
Focus on behavior
Over the past few years, I have become a huge behavior-driven development enthusiast. For many teams, behavior-driven development feels like overkill, but they could not be more wrong. Why do they think about BDD this way? If you do a quick search for BDD online, you will find Cucumber listed remarkably high. Cucumber is one of the tools that facilitate the use of BDD in software projects, but I am not going to say that you must use Cucumber (or any other tool) to be able to apply the best practices of BDD!
BDD is not a library or a framework that you need to depend on; no, BDD is a way of working. Where TDD states that software development should be driven by writing tests, BDD takes the desired behavior as your primary driver. What are the expectations we need to fulfill with the features we are implementing? BDD is not driven by the tests themselves; it uses tests to guarantee that we implement the expected behavior. BDD emphasizes that we need to implement exactly what our business wants us to implement. We use the business acceptance criteria as a baseline for our (acceptance) tests.
Getting started with BDD
Applying BDD can simply be done by translating all the acceptance criteria into automated tests. It does not matter whether you use a built-in tool like MockMVC or that you prefer implementing the acceptance tests with a specialized framework like Cucumber. This is a decision that every team should make for itself. If some team members already know how to work with Cucumber, I do not see any reason not to use it!
These libraries are built to facilitate the transition to BDD. They offer a way to make the tests and the test results understandable for non-technical stakeholders, because a better understanding of what we are doing as a software development team should increase their interest in our work. As a result, they should be more involved in the development process, giving us the ability to quickly deliver exactly what the business wants from us.
If we transition to a more behavior-driven testing approach, we will find ourselves writing better, more resilient tests that can stand the test of time. Tests focused on behavior do not depend on a specific implementation, so when we start refactoring, our tests should not start breaking too!
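Even without a specialized framework, an acceptance criterion can be translated almost literally into a test. A minimal sketch with a hypothetical `Checkout` domain, using plain Java instead of a test framework to stay self-contained:

```java
// Acceptance criterion, straight from the business:
// "Given a basket with two items, when the customer checks out,
//  then the order total is the sum of the item prices."
import java.util.List;

record Item(String name, double price) {}

class Checkout {
    double totalOf(List<Item> basket) {
        return basket.stream().mapToDouble(Item::price).sum();
    }
}

public class CheckoutAcceptanceSketch {
    public static void main(String[] args) {
        // Given a basket with two items
        List<Item> basket = List.of(new Item("book", 25.0), new Item("pen", 5.0));
        // When the customer checks out
        double total = new Checkout().totalOf(basket);
        // Then the order total is the sum of the item prices
        if (total != 30.0) throw new AssertionError("expected 30.0 but got " + total);
        System.out.println("acceptance criterion satisfied: total = " + total);
    }
}
```

The given/when/then structure mirrors the acceptance criterion, so the test keeps passing through any refactoring of how the total is computed.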
Upgrade your tests!
But do we really need to introduce BDD to write robust tests? Of course not! Acceptance tests are very often integrated tests that must spin up an application context. This takes time and makes our build slow, eventually leading to the tests being skipped in early build stages, which is a recipe for disaster!
But how can we focus on behavior instead of implementation with fast tests? Nothing is faster than a unit test and a unit test runs in isolation! That is true, but there is more to say about this.
What is a unit?
A unit test is a test for a specific unit, but what is this unit? There is no single definition of a unit. For some developers, a unit is a single class, sometimes even a single method. Personally, I do not agree with this anymore. A unit can also be as big as an entire flow through the application, where we only avoid depending on integrations with external systems or infrastructure like our database.
Integrating classes to test an entire flow does not mean our tests are no longer isolated. The isolation means that every test runs independently from other tests. They all create their own instance of the unit they are testing, not depending on the state that can be changed by another test.
Solitary tests
Martin Fowler has written a great post about this on his website[1], where he elaborates on two distinct types of unit tests. The first one is the type of tests that we learn to write when we first start programming. These are the solitary unit tests[2]. A solitary unit test often tests the methods of a single class. All dependencies are replaced by test doubles, like Mockito mocks. This allows us to test the code of a single class, without depending on the implementation of its dependencies.
As I mentioned earlier, the solitary tests can lead to Mockitis, something we should always try to avoid. The only known cure for Mockitis is abandoning the solitary unit tests in favor of another type of unit test.
Sociable tests
Fowler describes the second type of unit test as the sociable unit test. Here, the unit under test relies on its real dependencies instead of replacing them with test doubles. So, if I’m testing a service that relies on another independent service to perform a complex calculation, I do not need to mock out that service. I can simply instantiate my system under test along with all its dependencies to test the behavior of the unit I’m testing.
I said all its dependencies, but I should have written this a little more carefully: almost all its dependencies would have been more accurate. I don’t want to depend on the implementation of a real external service, nor on the data in a database, even if it is a database test container. I want my tests to be fast, so I do not integrate with components that can slow my tests down. These dependencies will still be replaced by a test double. Otherwise, we would need an application context to connect with the correct database or to integrate with an external REST API.
Although making assumptions as a software developer is very often the root cause of problems, there is one assumption we can make while writing sociable unit tests: we can assume that all our dependencies behave as expected. Why? Because all dependencies should have their own tests! This is why we shouldn’t bother replacing all dependencies with test doubles, leading to more concise tests with less boilerplate code for mocking the behavior of our dependencies.
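A sociable test could then look like this sketch (hypothetical `InvoiceService`; only the repository, being an infrastructure boundary, is replaced by a hand-rolled stub, while the collaborating calculator stays real):

```java
import java.util.List;

// Real collaborator: exercised directly, not mocked.
class DiscountCalculator {
    double discountFor(int loyaltyYears) {
        return Math.min(loyaltyYears * 0.01, 0.10);
    }
}

// Boundary to infrastructure: the one thing we replace with a double.
interface OrderRepository {
    List<Double> findAmounts(String customerId);
}

class InvoiceService {
    private final OrderRepository repository;
    private final DiscountCalculator calculator;

    InvoiceService(OrderRepository repository, DiscountCalculator calculator) {
        this.repository = repository;
        this.calculator = calculator;
    }

    double totalFor(String customerId, int loyaltyYears) {
        double sum = repository.findAmounts(customerId)
                .stream().mapToDouble(Double::doubleValue).sum();
        return sum * (1 - calculator.discountFor(loyaltyYears));
    }
}

public class SociableSketch {
    public static void main(String[] args) {
        // Only the repository is stubbed; the calculator is the real thing.
        OrderRepository stubRepository = customerId -> List.of(100.0, 50.0);
        InvoiceService service = new InvoiceService(stubRepository, new DiscountCalculator());
        System.out.println(service.totalFor("c-1", 5));
    }
}
```

If the discount formula inside `DiscountCalculator` is ever refactored without changing its behavior, this test keeps passing, which is exactly the resilience we are after.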
Testing behavior, not implementation
Instead of mocking, we write our tests with a focus on the behavior. As I mentioned earlier, this leads to less code in our tests. Less is more. When there is less code being written, our understanding of the code should be better.
However, I must mention that we can’t entirely abandon test doubles. When using frameworks in our project, like we all do, most of us rely on the framework to work its magic to provide some key components of our applications, like database connectivity for example. When writing (sociable) unit tests, we don’t want to spin up an application context to let the framework do its thing. Nevertheless, we will need an implementation for our repository layer. This is something that we will likely replace with a test double.
In my opinion, everything that is within the scope of our application and doesn’t directly communicate with an external component, should not be replaced. If the class is responsible for handling interactions with an external component, we will very often use test doubles, just to avoid the need for an application context. Using an application context slows down the test execution too much to benefit from it within a unit test.
I know that setting up sociable tests can be cumbersome, especially in larger projects. I’m currently exploring some possibilities to simplify this, but it’s too early to write down all the details here. Are you interested in how this could help you or do you have any ideas? Feel free to reach out and we’ll discuss this in depth! You can find me on LinkedIn, X and Bluesky!
[1] https://martinfowler.com/articles/2021-test-shapes.html
[2] Jay Fields came up with the terms “solitary” and “sociable”
Test Data
Another keystone of good tests is the data being used to run the test. Where does the data come from? How is the data being initialized?
I often see developers creating their test data inside their tests. This dirties the test code, especially when you are working with complex nested objects. I also very often see tests relying on static test objects. This has some critical disadvantages.
Testing with static data limits my confidence in the tests. I am confident that my code works well with the specific data used in the test, but what if one of the values changes? Will everything still work? I’m not sure about that!
I prefer to have some more random values in my tests. But how should we use random values in our tests? What are the different approaches, and which is the best choice? Keep reading to find out more about how I think about test data!
Mockaroo
A first step into writing tests with more random data can be taken by using something like Mockaroo. Mockaroo is a platform that can be used to generate random but realistic test data. You can specify what the test data should look like. Mockaroo provides a set of generators that populate the data with realistic values, but you can also create complex formulas that take values from other columns into account to calculate the value of a field.
You can simply define the schema of the data you want to generate and download a set of random values in different formats, like SQL, CSV and JSON. This data can then be included in your project. This is the easiest way to use Mockaroo. Note that even though your data is generated randomly, it becomes static data once you start using it. Mockaroo can also be used through API calls, but this again introduces some complexity in your tests. Doing so can also cause test failures when the Mockaroo API is not working correctly. This is something we want to avoid at all costs, because we only want our tests to fail for one reason: an issue in our own code.
Data generation libraries
Another way to create random data is by using libraries in your code. The first library I encountered for generating realistic random data was datafaker. Datafaker is a library that generates realistic pseudo-random values and ships with lots of built-in data providers. You also have the possibility to create and register custom providers to generate specific values.
Datafaker
I used datafaker in the past. Back then it was only possible to generate single random values, not entire objects populated with random values. This possibility was added in 2022. Since then, we can specify a schema where we set up how each value should be populated. This gives us the possibility to create complex nested objects with a single command.
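As a minimal illustration of single-value generation with datafaker (assuming the net.datafaker dependency is on the classpath):

```java
import net.datafaker.Faker;

public class DatafakerSketch {
    public static void main(String[] args) {
        Faker faker = new Faker();
        // Realistic pseudo-random values from built-in providers.
        String name = faker.name().fullName();
        String email = faker.internet().emailAddress();
        System.out.println(name + " <" + email + ">");
    }
}
```

Each run produces different yet realistic-looking values, which is exactly what makes tests less dependent on one specific data set.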
Instancio
Because datafaker didn’t provide this possibility in the past, I started searching for another library that could, and I found a powerful one called Instancio that did just that. The downside of Instancio compared to datafaker is that a generated string value is really just a random String, while with datafaker you can specify which provider should be used to generate it. However, this can also be achieved in Instancio by creating a custom generator.
We can also choose to combine both datafaker and Instancio, as they both depend on the default java.util.Random class to generate the values. A big advantage in Instancio is that we can easily regenerate the same value by passing a seed value to our test. This is something that is a little more cumbersome in datafaker.
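A small Instancio sketch illustrating both points, pinning a field and a seed on a hypothetical `Person` record (assuming the org.instancio dependency is available):

```java
import org.instancio.Instancio;
import static org.instancio.Select.field;

record Person(String name, String email, int age) {}

public class InstancioSketch {
    public static void main(String[] args) {
        // Fully random object: every field gets a generated value.
        Person random = Instancio.create(Person.class);

        // Pin one field, keep the rest random; the seed makes the
        // remaining values reproducible across runs.
        Person alice = Instancio.of(Person.class)
                .set(field(Person::name), "Alice")
                .withSeed(42L)
                .create();

        System.out.println(random);
        System.out.println(alice.name());
    }
}
```

Passing the same seed again regenerates the same object, which is invaluable when reproducing a failing test run.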
Which library you choose doesn’t really matter to me. The main thing you should be aware of is that it should be very easy to switch between libraries without having to change all your tests. How can you do this?
Introducing fixtures
To avoid having to change every test class when you decide to replace your data generation framework, you should always try to avoid using the framework directly in your tests. The cleanest way to do this is by using a fixture class, or what Martin Fowler calls the Object Mother pattern[1]. The creation of the test object can then be moved into the object mother, using a static factory method that is exposed to be used in one or more test cases.
Martin Fowler notes that the Object Mother pattern comes with some typical downsides, but using it with more random data will very likely avoid those. Why is that? The simplest object mother factory methods create an object with static data, and tests then rely on that static data. When using random data in your factory methods, you won’t be able to write your tests with static expectations. You will always have to use the values that are present in the created object to do your assertions.
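A minimal object mother for a hypothetical `Customer` record, using plain JDK randomness as a stand-in for a data generation library:

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

record Customer(String id, String name, int loyaltyYears) {}

// Object Mother: the single place that knows how a test Customer is
// built. Replacing the data-generation strategy later only touches
// this class, not the tests that use it.
class CustomerMother {
    static Customer aCustomer() {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        return new Customer(
                UUID.randomUUID().toString(),
                "customer-" + rnd.nextInt(1000),
                rnd.nextInt(0, 30));
    }
}

public class ObjectMotherSketch {
    public static void main(String[] args) {
        Customer customer = CustomerMother.aCustomer();
        // With random data, assert against the generated values,
        // never against hard-coded expectations.
        System.out.println(customer);
    }
}
```

Swapping `ThreadLocalRandom` for Instancio or datafaker later would be a change inside `CustomerMother` only.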
Reduce boilerplate
Another issue can occur when you are using an object mother in combination with immutable objects (a Java record, for instance). If you need a test object with some specific values set, you must create a factory method with parameters for that specific combination. This can lead to an unmanageable pile of factory methods, which is also something we should try to avoid!
On my current project, where we are using Instancio, we started thinking about how we could solve this issue. We came up with a builder-like construct where we can set specific values in a fluent way. The fields we don’t care about for a specific test case are populated with a random value when the build method is called. It took us some iterations to make it work for all of our specific use cases, but in the end, we had something that was good enough for us to use.
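The idea can be sketched with plain Java (hypothetical `Customer` record; a real implementation would delegate the random values to Instancio or datafaker):

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

record Customer(String id, String name, int loyaltyYears) {}

// Builder-like fixture: pin only the fields the test cares about;
// everything left unset is randomized when build() is called.
class CustomerFixture {
    private String name;
    private Integer loyaltyYears;

    static CustomerFixture aCustomer() { return new CustomerFixture(); }

    CustomerFixture withName(String name) {
        this.name = name;
        return this;
    }

    CustomerFixture withLoyaltyYears(int years) {
        this.loyaltyYears = years;
        return this;
    }

    Customer build() {
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        return new Customer(
                UUID.randomUUID().toString(),
                name != null ? name : "customer-" + rnd.nextInt(1000),
                loyaltyYears != null ? loyaltyYears : rnd.nextInt(0, 30));
    }
}

public class FixtureBuilderSketch {
    public static void main(String[] args) {
        // Only loyaltyYears matters for this test; id and name stay random.
        Customer customer = CustomerFixture.aCustomer().withLoyaltyYears(5).build();
        System.out.println(customer);
    }
}
```

One fluent builder replaces the pile of parameterized factory methods, while keeping the random-by-default behavior of the object mother.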
Open Source
I ended up extracting this setup into a separate project that has been published as an artifact[2], ready to be used by anyone! There is still room for improvement, because the builder still has to be implemented for every class, so some boilerplate code is required. The plan is to investigate the possibility of generating the entire fixture builder at compile time, just like Lombok generates code based on the annotations we add to a class.
[1] https://martinfowler.com/bliki/ObjectMother.html
[2] Visit https://wouter-bauweraerts.github.io/instancio-fixture-builder/ for more details
Conclusion
Automated tests are a great tool to improve the reliability of our codebase, but there are some things we must be aware of. One thing we should avoid is writing our tests (or having AI generate them) based on the production code. We can avoid this by using the tests as the entry point for our implementation, which is exactly what a TDD or BDD workflow gives us.
Another thing to keep in mind is that our tests should not rely on a specific implementation. This can be achieved by writing sociable tests, that focus on asserting that the code behaves as expected instead of testing the specific technical implementation with solitary unit tests.
Finally, we can also improve our tests by adopting a data generation library like Instancio or Datafaker to populate our test objects with random – but realistic – data. There are some patterns we can use to abstract away this library in our tests, so that when we decide to switch to another data generation library, we don’t have to change every test accordingly.
The Object Mother pattern is the easiest way to achieve this, but we can provide an additional layer of flexibility to this by adding a fluent way to customize the object creation without the risk of cluttering our code by piling up factory methods for every possible parameter combination we need in our tests.