introduction
Most developers build queries to access a relational database, using some ORM in different ways to generate the query automatically or write the query manually. Everything could work in most cases in other environments, but at some point, the query becomes slow, or the application takes more time to execute the same task that it did in the past. At this point, the developers, QA, and many others waste a lot of time understanding the change that produced the problem in the last commits, but why not create a test that detects some issues with performance before they appear?
What are the risks of bad performance on a query?
Bad query performance on an application affects multiple aspects, not just the fact that the query takes more time to execute. An application’s endpoints, especially microservices, have several implications on different system flows, affecting stability, scalability, and the user experience.
Some of the most relevant aspects to consider are:
- Increase Response Time: One of the most visible problems is that it introduces a query that takes seconds instead of just a few milliseconds. Other applications could not wait forever and threw a timeout exception, affecting the user experience.
- Increase Resource Consumption: If the query takes more time to execute, it will affect the use of the CPU and memory on the application that is waiting for the response, but it will also affect the performance of the database.
- Hidden Costs in Infrastructure: If a microservice or application has problems with queries, the teams try to compensate by increasing the number of resources available for the databases, adding more replicas, or changing the hardware associated with that database.
- Hard-to-Diagnose Failures: Not all query problems occur, so some may appear only with some specific filters or attributes, affecting the time the developers need to spend understanding what happens behind the scenes with that problem.
These items represent a short list of all the possible problems associated with poor query performance. It is essential to detect the issues quickly and automatically, such as through tests.
Performance Issues Found
Analyze the problems related to the performance of the applications that access the database. Most projects offer a series of recurring issues. These problems do not break the applications, so finding the issues’ origin is difficult, slowly degrading the performance until it becomes critical. Some of the most relevant of these problems are:
- N+1 Query Problem: This problem happens when the application executes a query that obtains a list of entities, each needing more information for another entity, producing additional queries. This problem could make dozens of unnecessary queries.
- Inefficient Join: This could not be a problem if the tables had the correct index and did not require multiple joins to obtain the information.
- Unnecessary Data: Most automatic queries generated by ORM obtain information about all the columns from a database by doing a SELECT *. Depending on the number of columns and the type of query from a table, these queries could increase memory consumption and the time it takes to process the result in the application and database.
- Bad Pagination: This problem is associated with using TOP and OFFSET on the query. The page size could drastically affect the database’s performance in collecting and returning information. At the same time, it implies a high volume of memory to process and keep in the application’s memory.
- Bad use of indexes: If the tables in a database do not contain the correct number of indexes, which could be used for the application’s queries, the application could produce a full scan, increasing the resources needed to find the information.
There are other performance-related problems; the previous list is just an example of the issues found in an application and their effects.
HOW to test queries IN JAVA?
Testing queries in Java is not just about creating a test that verifies whether a feature, such as inserting, updating, deleting, or retrieving an entity’s information, works correctly; it also involves validating an entity’s performance, efficiency, and scalability. Depending on the scope, the approach could include creating either unit or integration tests.
Usually, unit tests are used to validate part of the application, not to use real databases or mock JDBC/ORM interactions. This approach is not the best for detecting performance problems. However, it helps verify if a query is written correctly and returns a valid response without considering performance issues.
On the other hand, integration tests run against real databases using technologies like Testcontainers to set up a database each time tests are executed. This approach and other libraries can help identify execution time, indexing, and query plans.
tOOLS
The tools or libraries that could help to detect possible problems with the performance are diverse; some of them depend on the framework or ORM that the application uses, but the most relevant are:
- Hibernate Statistics API: This API provides insights into the query counts and execution details, but implies that someone must detect the problem before using this library, and it’s impossible to create tests to detect issues proactively.
- Profiling & Monitoring Tools: Tools like VisualVM or APM solutions (New Relic, Datadog, and many others) help detect which part of the application has a problem, but do not provide any feedback about the issue on that query.
- Gatling/JMeter: These tools are ideal for simulating a large number of requests to one endpoint or flow and observing how the application reacts, but they do not provide information about the location of the problem.
- Jfrunit: This library helps detect problems with the application’s performance, and combining it with other libraries provides a good way to create tests, but it’s impossible to detect issues with queries.
- QuickPerf: This library allows the detection of bad use of application resources, such as the allocation of the heap, and the validation of different aspects of queries, such as N+1, joins, execution time, and many others.
There are many other libraries or tools for detecting problems or giving clues about their location. The previous list includes the most relevant of them.
Detecting Performance Problems
Detecting the problem with the performance has two parts: one is reactive, after a problem occurs on production or staging, and the second is proactive, creating tests to detect possible issues. Consider that in both approaches, there is a series of tests to detect the problems in the performance at different levels, such as:
- Unit and integration tests to validate the query behaviour.
- Performance tests will simulate what happens with realistic scenarios and how the query works.
- End-to-end tests validate that the entire flow works and that the query does not affect the APIs’ normal response times.
Let’s start creating the most basic tests, which will reduce the work of the developers.
Introduction to the QuickPerf
QuickPerf is a Java library designed to help developers find performance-related issues when creating tests. The main difference with other alternatives is that this library works inside the problem with other testing libraries like JUnit and TestNG. It also has support for Spring Boot, Hibernate, Micronaut, and Quarkus, so it’s an ideal option for testing not just the application’s queries; it’s possible to check the allocation of memory on one particular method.
This tool does not replace other testing types or tools like APM, but is just a complement to detect problems quickly.
This tool has a series of pros; the most relevant:
- Automated Feedback: Reduce the time to inspect or analyze the queries, and identify the possible problem because the tool, in most cases, offers some valuable information and a possible solution.
- Lightweight & Not Complex: The application’s setup is minimal, and the difference with a standard unit test is simple. In most cases, it implies adding annotations with the things to validate.
- Improve Team Awareness: This tool encourages developers to consider which aspects need to be validated in a test and also offers valuable feedback on which aspects to consider when querying or accessing the database.
- Community: This tool’s community grows each month because it adds support for different libraries or frameworks, so it’s not limited to using Spring Boot or Hibernate. Also, it’s possible to use it with Kotlin if the application uses some of the frameworks that the library supports.
This tool has a series of cons; the most relevant:
- Limiting Scope: Focus mainly on the performance of the queries and some other relevant aspects of using the JVM, but it’s impossible to check the CPU bottlenecks.
- Creating Custom Rules: This is one of the aspects that is not simple to do with the tool because the official documentation does not offer much information.
- False Sense of Security: If the tests pass, it does not mean that everything works fine on the queries because different aspects, like performance on a significant volume of requests, need to be tested.
Considering the tradeoff, the library is a good candidate for creating tests and detecting problems with performance.
EXAMPLE OF USE
Let’s start creating some tests to validate the query performance and represent this library’s behavior in a more or less real scenario. The following repository contains an API that has some problems with the queries and mappings. This API is just a simple catalog of cities, countries, etc.
The first step implies adding all the dependencies related to QuickPerf to detect problems with the queries on Spring Boot.
<!-- QuickPerf dependencies -->
<dependency>
<groupId>org.quickperf</groupId>
<artifactId>quick-perf-junit5</artifactId>
<version>${quickperf.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.quickperf</groupId>
<artifactId>quick-perf-springboot2-sql-starter</artifactId>
<version>${quickperf.version}</version>
<scope>test</scope>
</dependency>
<!-- End of QuickPerf dependencies -->
With just these changes, the next step is to create the test, which will run on a real database, so before we continue, let’s add support to TestContainers to combine with QuickPerf without affecting any real database.
<!-- Testcontainers dependencies -->
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>junit-jupiter</artifactId>
<version>${testcontainers.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>mysql</artifactId>
<version>${testcontainers.version}</version>
<scope>test</scope>
</dependency>
<!-- End of Testcontainers dependencies -->
In the case of both libraries, check the version compatible with the libraries or frameworks the application uses.
To continue creating tests to validate a query, let’s make a class that sets up the containers and replaces the variables in the configuration.
@Testcontainers
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class CityRepositoryTest {
@Autowired
CityRepository repository;
public static MySQLContainer container = new MySQLContainer<>("mysql:5.7").withUsername("mysql")
.withPassword("mysql").withDatabaseName("catalog").withReuse(true);
@BeforeAll
public static void setUp() {
container.start();
}
@DynamicPropertySource
static void databaseProperties(DynamicPropertyRegistry registry) {
registry.add("spring.datasource.url", container::getJdbcUrl);
registry.add("spring.datasource.username", container::getUsername);
registry.add("spring.datasource.password", container::getPassword);
}
// Add the tests to validate the queries
}
The last step is to create a simple test that validates if the number of queries executed on an INSERT is just one. The following block represents that scenario.
@Test
@ExpectInsert(1) // Validate the number of queries that are executed
@AnalyzeSql
void should_insert_a_city() {
City newCity = new City();
newCity.setName("Test");
newCity.setCode("TST");
newCity.setTimeZone("America/Argentina/Buenos_Aires");
City city = repository.save(newCity);
assertAll(() -> assertNotNull(city), () -> assertNotNull(city.getId()),
() -> assertEquals("TST", city.getCode()));
}
Before running the test, let’s add two annotations to the test class declaration to enable QuickPerf to analyze the SQL queries.
@Import(QuickPerfSqlConfig.class)
@QuickPerfTest
@Testcontainers
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class CityRepositoryTest {
// Previous code
}
Now it’s time to execute the tests and see what happens with the results. If everything works fine, on the console, something like the following will appear:
[QUICK PERF] SQL ANALYSIS
* * * * *
SQL EXECUTIONS: 1
MAX TIME: 1 ms
* * * * *
INSERT: 1
* * * * *
QUERY
Time:1, Success:True, Type:Prepared, Batch:False, QuerySize:1, BatchSize:0, Query:["
insert
into
city
(code, name, country_id, time_zone)
values
(?, ?, ?, ?)"], Params:[(TST,Test,NULL(BIGINT),America/Argentina/Buenos_Aires)]
The test passes but offers valuable information about the execution time and which query was executed.
This library validates the INSERT and can validate other queries using annotations like @ExpectSelect, @ExpectUpdate, and @ExpectDelete.
detecting N+1 queries
This issue sometimes occurs when entities in relationships with others have the strategy to fetch the information set on EAGER.
Let’s create a test on the same class to check this possible scenario. The code has a bug that requires each city to obtain all the country’s associated information.
@Test
@ExpectSelect(1) // Validate the number of queries that are executed
@AnalyzeSql
void should_get_a_city() {
Optional<City> city = repository.findByCode("BUE");
assertAll(() -> assertTrue(city.isPresent()), () -> assertEquals("BUE", city.get().getCode()));
}
After executing this test, the following information will appear on the console.
java.lang.AssertionError: a performance-related property is not respected
[PERF] You may think that <1> select statement was sent to the database
But there are in fact <2>...
💣 You may have even more select statements with production data.
Be careful with the cost of JDBC roundtrips: https://blog.jooq.org/2017/12/18/the-cost-of-jdbc-server-roundtrips/
💡 Perhaps you are facing an N+1 select issue
* With Hibernate, you may fix it by using JOIN FETCH
or LEFT JOIN FETCH
or FetchType.LAZY
or ...
Some examples: https://stackoverflow.com/questions/32453989/what-is-the-solution-for-the-n1-issue-in-jpa-and-hibernate";
https://stackoverflow.com/questions/52850442/how-to-get-rid-of-n1-with-jpa-criteria-api-in-hibernate/52945771?stw=2#52945771
https://thoughts-on-java.org/jpa-21-entity-graph-part-1-named-entity/
* With Spring Data JPA, you may fix it by adding @EntityGraph(attributePaths = { "..." })
on repository method: https://docs.spring.io/spring-data/jpa/docs/current/reference/html/#jpa.entity-graph
On the console, it seems not only are the queries that the application executes, but it also sometimes offers a possible solution to the problem or links to obtain more information.
execution time threshold
Another common issue is the time one or a series of queries takes to execute. To tackle this problem, QuickPerf offers an annotation called @ExpectMaxQueryExecutionTime to specify the expected duration of that query. Let’s modify one of the tests to introduce this validation.
@Test
@ExpectSelect(2) // Validate the number of queries that are executed
@ExpectMaxQueryExecutionTime(thresholdInMilliSeconds = 0) // This check the duration of the execution of the query
@AnalyzeSql
void should_get_a_city() {
Optional<City> city = repository.findByCode("BUE");
assertAll(() -> assertTrue(city.isPresent()), () -> assertEquals("BUE", city.get().getCode()));
}
This test modifies the condition of @ExpectSelect to pass and only shows what happens with the @ExpectMaxQueryExecutionTime annotation. After the execution, an error like the following will appear on the console.
[PERF] Query execution time expected to be less than <0 ms>
At least one query has a greater execution time. The greater query execution time is <1 ms>
This type of validation is recommended for introducing queries, especially complex queries that imply multiple joins.
VALIDATING MULTIPLE QUERIES
Not all cases require validating the repositories or layers that access the database; perphaps it could be a good idea to validate some part of the business logic that accesses the database multiple times.
Let’s create a test that combines multiple annotations simultaneously, for example, the method responsible for inserting a new city, before validating if one with the same information does not exist, and after that, do the insert, so execute two different queries to the database.
@Import(QuickPerfSqlConfig.class)
@QuickPerfTest
@Testcontainers
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
public class CityServiceTest {
@Autowired
CityService service;
public static MySQLContainer container = new MySQLContainer<>("mysql:5.7").withUsername("mysql")
.withPassword("mysql").withDatabaseName("catalog").withReuse(true);
@BeforeAll
public static void setUp() {
container.start();
}
@Test
@ExpectInsert(1) // Validate the number of queries that are executed
@ExpectSelect(1) // Validate the number of queries that are executed
@ExpectMaxQueryExecutionTime(thresholdInMilliSeconds = 1) // This check the duration of the execution of the query
@AnalyzeSql
void should_insert_a_city() {
CityDTO newCity = new CityDTO();
newCity.setName("Test");
newCity.setCode("TST");
newCity.setTimeZone("America/Argentina/Buenos_Aires");
CityDTO city = service.insert(newCity);
assertAll(() -> assertNotNull(city), () -> assertEquals("TST", city.getCode()));
}
}
If everything is okay with the tests, the following will appear after the execution on the console.
QUICK PERF] SQL ANALYSIS
* * * * *
SQL EXECUTIONS: 2
MAX TIME: 1 ms
* * * * *
SELECT: 1
* * * * *
INSERT: 1
* * * * *
Consider adding other annotations to validate multiple things, depending on the complexity of the method.
BEST PRACTICES
QuickPerf is a great library, but there are some aspects to consider when using it; some of the most relevant best practices are:
- Start Small: One of the biggest mistakes is trying to add tests for all the queries on an existing API. This process takes too much effort. The best approach is to add tests for new features or when a new issue appears.
- Create a new profile: The best way to do this is to create a new profile on the application to run the tests related to the queries exclusively. So, if something requires a quick deploy, it’s possible to skip it and reduce the pipeline’s time.
- Check with the DBA: If the error is complex or unclear about what the problem could be, show the query to the company’s DBA to obtain feedback about a possible solution.
- Add all the possible validations: When creating a new test that uses this library, try to add all the validations, like the maximum duration and the number of queries to execute in the database.
This list just represents a small list of all the possible best practices, but the best way to discover which of them are effective is to create tests and see what happens with them.
What’s next?
Many resources cover software performance, database optimization, and the impact of inefficient queries on applications. However, only a few focus on testing and detecting database performance issues in Java applications. The following is a short list of valuable resources:
- QuickPerf: Testing and Improving Performance in Java
- High-Performance Java Persistence by Vlad Mihalcea
- SQL Performance Explained by Markus Winand
- Effective SQL by John Viescas and Douglas Steele
Other resources that can help build a broader understanding of good practices and application performance are:
- Clean Architecture by Robert C. Martin
- Designing Data-Intensive Applications by Martin Kleppmann
This is a small sample of the available performance and database testing resources. If something remains unclear, explore other books, articles, or tutorials.
CONCLUSION
Access issues to the database are not always something that developers, QA, and other profiles pay attention to when creating an application until the problem becomes critical.
Creating tests using QuickPerf or any other tool reduces the risk that something bad happens in production with the queries. Still, it implies that the developers consider all the possible scenarios in each access to the database. Also, it’s necessary to create this type of test not only to verify the layer responsible for accessing the database but also to validate other layers that execute many accesses into the database to obtain the information, like the layers responsible for the business logic.
Try using this type of test on the critical parts of the application and then propagate to the rest. Understanding which parts are relevant or critical inside an application takes time, so do not feel frustrated if it does not work on the first attempt.