More Action, more Overview

Merlin Bögershausen
Java applications have to contend with various concurrency problems. New constructs in the platform offer a solution.

Concurrent programming in Java is as old as Java itself. The first version already contained the basic constructs on which today’s concurrent processing is based. Threads and Runnables make it possible to decouple the sequential processing of operations and achieve higher throughput. Over the years, a number of changes have been made to Java’s concurrency API to address various weaknesses: the ExecutorService API to save resources, for example, or the Future API to avoid having to wait for long-running operations. Now there is renewed movement in the concurrency topic. Virtual threads decouple JVM threads from operating system threads in order to save further resources and make the JVM more efficient. This opens up new possibilities for developing elegant APIs. In this article, I would first like to take a look back and consider which abstractions are available, what the difficulties were and why virtual threads are a solution here. I will then introduce the constructs that build on them, Structured Concurrency and Scoped Values, and look at the whole thing using a few examples. These examples are available as a GitHub Gist and can be viewed in addition to this article.

Threads in general

Threads are a concurrent programming construct used to group operations together and execute these groups in parallel. The operations within a thread run in a certain order. A thread is therefore a small program within a program. A thread is created within a thread or process. This operation is called fork and is outlined in the figure below.

The process of creating a thread in general

When forking, a new thread is created from a process or thread as its child, whereby a thread can have several children but only one parent thread. In the figure, the parent thread creates two new threads that exist and execute independently of each other. After they have completed their operations, execution reunites in the parent thread. The parent thread can only be terminated once all child threads have completed. The join operation can be used to wait explicitly for the completion of a child thread, or the join can take place implicitly before the parent thread is cleaned up.

However, a thread consists not only of its program code, but also of metadata that the operating system needs in order to execute it. This metadata includes its own ID, the ID of the parent thread, a dedicated memory area and a status. This status is necessary to understand why the direct coupling between an operating system thread and a JVM thread can be problematic. The status diagram is given here.

Status diagram of a thread

A thread first starts in the New status. In this status, it is not yet executable and is not executed. This status is intended for initialization: memory areas are allocated, the IDs are set and priorities are configured. Only when the configuration is complete does the thread switch to the Runnable status. In this status, the thread can be selected for execution by the operating system. If the thread is selected, it executes a certain number of operations from its program until it is interrupted. Normally, the thread then remains in the Runnable status. However, the thread can also switch to the Terminated status after it has completed its work. In this status, the thread is finished and is no longer selected for execution, but still occupies resources. Two further states are Blocked and Waiting, both of which signal a thread that is currently not executable. Blocked refers to a thread that is blocked by another thread, for example by a synchronized block or access to IO. The Waiting status indicates that the thread is currently waiting for another thread, for example for a response. All five states are mirrored in the JVM. This means that if a JVM thread is marked as Blocked, the corresponding thread is also marked as Blocked in the operating system, as JVM threads are a mirror image of operating system threads. In all operating systems, there is an upper limit for the number of threads. If this limit is reached, no new thread can be created and the operating system is unable to act. For this reason, concurrent programming tries to use threads sparingly and not execute every operation in a new thread.
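The state transitions described above can be observed from Java code. The following is a minimal sketch and not one of the article's original examples; the class name ThreadStateDemo and the method observeStates are made up for illustration. It holds a worker on a CountDownLatch so that the Waiting status can be sampled reliably. Note that Java's Thread.State enum also has a sixth value, TIMED_WAITING, for waits with a timeout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Sketch: records the Thread.State values one platform thread passes through.
class ThreadStateDemo {

    static List<Thread.State> observeStates() {
        List<Thread.State> seen = new ArrayList<>();
        CountDownLatch release = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                release.await(); // parks the worker until the latch is opened
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        seen.add(worker.getState()); // created but not yet started
        worker.start();

        // Poll until the worker is parked inside await().
        while (worker.getState() != Thread.State.WAITING) {
            Thread.onSpinWait();
        }
        seen.add(worker.getState());

        release.countDown(); // let the worker finish its run method
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        seen.add(worker.getState()); // sampled after the worker has ended
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observeStates()); // prints [NEW, WAITING, TERMINATED]
    }
}
```

The Runnable status is deliberately not sampled here, because the worker may already be parked by the time the creating thread looks at it.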

Java Threads API

The Thread API in Java has existed since version 1.0 and has seen only small changes. A thread in the JVM always reflects an operating system thread. To define a thread, the class java.lang.Thread is extended and the run method is implemented with the required operations. In the example below, a class ExampleThread is defined. Like any class, it can be instantiated with the new operator.

class ExampleThread extends Thread {
    public void run() { /* Do it */}
}

var t1 = new ExampleThread();

t1.start(); // New -> Runnable
t1.join();  // waits for t1 to finish; throws InterruptedException

This instance of the thread is the smallest schedulable unit of the JVM. However, as already outlined in the last section, it is not the JVM but the operating system that manages the selection of the running thread. After creation with new, the JVM thread and the platform thread have the status New and have been initialized. In our Java program, further initialization can be carried out before the thread is moved from the New status to the Runnable status with the Thread#start method call. If the thread is selected for execution by the platform’s scheduler, processing of the logic starts. To wait for the completion of a child thread in the creating thread, there is the Thread#join method and a version with an explicit timeout.
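The difference between the two join variants can be sketched as follows. This is an illustrative example under assumptions: the class JoinTimeoutDemo, the record Result and the method timedThenFullJoin are hypothetical names, and the "long-running work" is simulated with a latch.

```java
import java.util.concurrent.CountDownLatch;

// Sketch: contrasts join(timeout) with the untimed join().
class JoinTimeoutDemo {

    // Hypothetical result holder for this illustration.
    record Result(boolean aliveAfterTimedJoin, boolean aliveAfterFullJoin) {}

    // timeoutMillis must be greater than 0, because join(0) waits indefinitely.
    static Result timedThenFullJoin(long timeoutMillis) {
        CountDownLatch release = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            try {
                release.await(); // simulates work that has not finished yet
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join(timeoutMillis);            // gives up waiting after the timeout
            boolean afterTimed = worker.isAlive(); // worker is still held by the latch
            release.countDown();                   // let the worker finish
            worker.join();                         // waits until the worker has ended
            return new Result(afterTimed, worker.isAlive());
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(timedThenFullJoin(50));
    }
}
```

The timed variant does not report whether the child actually finished, so the caller has to check isAlive afterwards.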

Another way to execute code in parallel is to implement the Runnable interface as shown in the example.

class ExampleRunnable implements Runnable {
    public void run() { /* Do it again */}
}

var runnable = new ExampleRunnable();
var t2 = new Thread(runnable);

A runnable is a component of a thread and the smallest unit of concurrent work. A runnable groups operations that belong together. The big difference between a runnable and a thread is that a runnable is not bound to a platform thread. Instead, it defines which operations are executed and thus decouples the tasks from the technical execution.
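To illustrate this decoupling, the following sketch runs one and the same Runnable in three ways: directly on the calling thread, on a dedicated thread and through an ExecutorService. The class name RunnableDecouplingDemo and the thread names are assumptions chosen for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch: one Runnable, three different ways of executing it.
class RunnableDecouplingDemo {

    static List<String> runThreeWays() {
        List<String> executedOn = new CopyOnWriteArrayList<>();
        // The task only records the name of the thread that happens to run it.
        Runnable task = () -> executedOn.add(Thread.currentThread().getName());

        task.run(); // 1. plain method call on the calling thread, no new thread

        Thread dedicated = new Thread(task, "dedicated-thread");
        dedicated.start(); // 2. a thread created just for this task

        ExecutorService pool =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "pool-thread"));
        try {
            dedicated.join();        // keep the recorded order deterministic
            pool.submit(task).get(); // 3. a thread managed by an executor
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return executedOn;
    }

    public static void main(String[] args) {
        System.out.println(runThreeWays());
    }
}
```

The task itself never changes; only the component that executes it does.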

Two further concurrency models in Java, the ExecutorService and the Future, also aim to decouple technical thread management, execution and the flow of business logic. The ExecutorService is an interface and has been available since Java 5. Its aim is to use the operating system threads as efficiently as possible. Tasks can be handed to it in the form of a Runnable and are executed later. An ExecutorService can use one or more threads to carry out the work. The Future construct has also been around since Java 5 and has the purpose of executing long-running operations concurrently. This means that the business code does not have to be blocked while waiting for a response from a database. Both can be combined and lead to an efficient decoupling of long-running business code as well as resource-saving parallel execution. Both solutions share two problems:

  1. Exceptions that are thrown in the concurrent operations are encapsulated in other exceptions and thus make error handling not straightforward
  2. A blocked execution also blocks the operating system thread used

In order to be able to use the operating system threads even more efficiently, the direct connection between the operating system thread and the JVM thread must be broken up.
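The first of the two problems listed above can be shown in a few lines. In this sketch, the class FutureErrorDemo and the failing "database" task are hypothetical; the point is only that Future#get reports the original exception wrapped in an ExecutionException.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: an exception thrown inside a task surfaces wrapped in ExecutionException.
class FutureErrorDemo {

    static String describeFailure() {
        // Hypothetical task that fails the way a database call might.
        Callable<String> failingQuery = () -> {
            throw new IllegalStateException("database unavailable");
        };
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(failingQuery);
            return future.get(); // blocks the calling thread until the task is done
        } catch (ExecutionException e) {
            // The business exception is only reachable via getCause().
            Throwable cause = e.getCause();
            return e.getClass().getSimpleName() + " caused by "
                    + cause.getClass().getSimpleName() + ": " + cause.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // prints "ExecutionException caused by IllegalStateException: database unavailable"
        System.out.println(describeFailure());
    }
}
```

Callers therefore have to unwrap the cause before they can react to the actual error.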

JEP 444 Virtual Threads

JEP 444 Virtual Threads is intended to break the link between JVM and operating system threads. As a result, virtual threads are no longer bound to an operating system thread. As less information has to be stored, virtual threads are significantly lighter than platform threads. Of course, a virtual thread must also be executed by a platform thread and therefore an operating system thread. The efficient use of the available threads is realized at the JVM level.

The API finalized by JEP 444 in Java 21 is rather small, but the necessary changes within the JVM were all the more enormous. At the core of the API changes for software developers is the new factory method in the listing below.

Thread.ofVirtual()
    .name("virtualThread-1")
    .inheritInheritableThreadLocals(true)
    .uncaughtExceptionHandler((thread, throwable) -> 
        System.err.println(thread.getName() + ": " + throwable))
    .start(() -> IO.println("Hello from virtual thread"));

Thread#ofVirtual is used to create a builder for a virtual thread. The name is then set, and the builder is configured so that inheritable ThreadLocals are inherited. In addition, an explicit fallback exception handler is registered. The start method is given the Runnable to be executed and starts the thread. No separate operating system thread is created here; when a platform thread is created with the Thread#ofPlatform method, an operating system thread is also created. After creation, virtual and platform threads behave in the same way from an application perspective.

In addition to the changes in the thread class, a new ExecutorService has also been provided. This uses a new virtual thread for each task instead of a thread pool.

Executors
    .newVirtualThreadPerTaskExecutor()
    .submit(() -> IO.println("Hello, World!"));
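As a rough sketch of what this executor does, the following example submits many blocking tasks and counts how many of them ran on a virtual thread. It assumes Java 21 or later; the class VirtualThreadDemo and the method countVirtualExecutions are names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch (Java 21+): every submitted task gets its own virtual thread.
class VirtualThreadDemo {

    static int countVirtualExecutions(int taskCount) {
        List<Future<Boolean>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                results.add(executor.submit(() -> {
                    Thread.sleep(10); // blocking call inside the task
                    return Thread.currentThread().isVirtual();
                }));
            }
        } // close() waits until all submitted tasks have completed

        int virtualCount = 0;
        for (Future<Boolean> result : results) {
            if (result.resultNow()) { // safe: all tasks are done at this point
                virtualCount++;
            }
        }
        return virtualCount;
    }

    public static void main(String[] args) {
        System.out.println(countVirtualExecutions(1_000));
    }
}
```

No pool size has to be chosen; the number of virtual threads simply follows the number of tasks.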

The changes to the API introduced by JEP 444 are not major, but the features they enable are. The new constructs Structured Concurrency and Scoped Values are based on the virtual threads finalized in Java 21.

Structured Concurrency

The aim of the Structured Concurrency feature, which is currently in preview, is to understand related tasks that are executed in different threads as a single unit of work. This is intended to improve error handling, abort behavior, readability and maintainability. An example of a group of related tasks can be found in the illustration.

Exemplary processing of a call in a service with communication to other services

This example shows two exemplary calls to a service. For each call, this service loads an address and a name from databases or from other services. The address and name are loaded in parallel so that execution is not prolonged. Once both pieces of information are available, the response is generated. Until the introduction of Structured Concurrency, this procedure would have been realized with some CompletableFutures. The problem with futures is the non-optimal error handling and the poor readability due to scattered interactions with the results. The StructuredTaskScope API (see listing) addresses these shortcomings.

public class StructuredTaskScope<T> implements AutoCloseable {
    public <U extends T> Subtask<U> fork(Callable<? extends U> task);

    public StructuredTaskScope<T> join() throws InterruptedException;
    public StructuredTaskScope<T> joinUntil(Instant deadline)
            throws InterruptedException, TimeoutException;

    public void shutdown();
    public void close();
}

The StructuredTaskScope implements AutoCloseable, indicating that it is intended for use with a try-with-resources block. This block makes it clear which tasks belong together and consolidates error handling. Within a StructuredTaskScope, a new task can be started using fork. Unlike a thread, this task is configured with a Callable, and fork returns a Subtask that implements the Supplier interface. In addition to creating tasks, there are also two join methods for waiting on the currently running tasks. Each join call is blocking and waits until all tasks started in this scope have completed. The joinUntil method can also be used to set a deadline, after which a TimeoutException is thrown. The two methods shutdown and close do not need to be called by developers themselves; the try-with-resources block coordinates the shutdown. The listing below is an excerpt from a concurrent rule engine that works with Structured Concurrency.

try (var scope = new StructuredTaskScope.ShutdownOnFailure()){
    Supplier<String> adr = scope.fork(() -> readAddress());

    scope.join().throwIfFailed();

    var address = adr.get();
} catch (ExecutionException | InterruptedException e) {
    throw new RuntimeException(e);
}

In the resource definition of the try-with-resources block, the scope is initialized as a ShutdownOnFailure scope. At this point, ShutdownOnFailure means that the first failed subtask leads to a shutdown and thus termination of all tasks. Other common use cases are implemented in the Java API and custom implementations are possible. In the second line, the task of loading an address is handed to the scope for execution and the supplier for the result is saved. In order to access the loaded value safely, the join method in line 3 is used to wait. The subsequent throwIfFailed call instructs the scope to report a failed task as an exception. Once all tasks of the scope have been completed, the get method of the Supplier interface can be used to access the produced value. The try-with-resources block also enables explicit exception handling in catch blocks. These are the exceptions that are produced by the scope. This enables central error handling, clearly structures the sequence of tasks and improves overall maintainability. Nothing stands in the way of using virtual threads in our applications to coordinate concurrent work. Scoped Values have been introduced to make efficient use of shared data.
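For comparison, the CompletableFuture-based realization mentioned earlier in this section might look roughly like the following sketch. The class FutureCompositionDemo and the two loader parameters are hypothetical stand-ins for the name and address lookups; this is not the code of the rule engine.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.function.Supplier;

// Sketch: loading name and address in parallel with CompletableFuture.
class FutureCompositionDemo {

    static String buildResponse(Supplier<String> nameLoader, Supplier<String> addressLoader) {
        CompletableFuture<String> name = CompletableFuture.supplyAsync(nameLoader);
        CompletableFuture<String> address = CompletableFuture.supplyAsync(addressLoader);
        try {
            // The result handling lives in a callback, away from the two loaders.
            return name.thenCombine(address, (n, a) -> n + ", " + a).join();
        } catch (CompletionException e) {
            // The original exception is wrapped; a failing loader does not
            // cancel the other one, which simply keeps running.
            return "failed: " + e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildResponse(() -> "Jon Doe", () -> "Example Street 1"));
        System.out.println(buildResponse(() -> "Jon Doe", () -> {
            throw new IllegalStateException("address service down");
        }));
    }
}
```

Compared with the scope-based listing, there is no single block that shows which tasks belong together, and a failure in one lookup has no effect on the other.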

Scoped Values

In many concurrent applications, certain instances or data must be used consistently. One example of this is an SSLContext. Creating an SSLContext instance is time-consuming and resource-intensive. Using a new instance in each of millions of virtual threads is inefficient and negates the lightness gained from virtual threads. Scoped Values were designed to share immutable data between a thread and its child threads. The differences to ThreadLocal instances are illustrated in the figure.

Different treatment of instances when using ScopedValues and ThreadLocals

A ThreadLocal instance provides a new instance of the stored values for each thread, as can be seen from the different addresses in T1 and T2 for sslCtx. This construct allows behavior and configuration to be shared, but duplication increases memory consumption. With Scoped Values, the approach is to provide the same instance in the child threads. The goals were defined as follows:

  • Ease of Use – the flow of data should be easily traceable
  • Comprehensibility – the lifetime of the shared data should be evident from the syntactic structure of the code
  • Robustness – Data shared by a caller should only be retrievable by legitimate callers
  • Performance – Data should be shared efficiently across numerous threads

The API required to manage and configure a scoped value is kept small, but is all the more powerful.

final static ScopedValue<SSLContext> VAL = ScopedValue.newInstance();

ScopedValue.where(VAL, value).run(() -> VAL.get());

The ScopedValue#newInstance factory method is used to create a new instance of a ScopedValue. This instance is not the value itself, but a data carrier that provides the actual value on request. The ScopedValue#where configuration method is used to configure which value the ScopedValue should have in this scope. A Runnable is passed via the ScopedValue.Carrier#run method; it is executed in this scope and gets access to the configured values via ScopedValue#get. In the following code example, two scoped values are used to manage an SSLContext and input data as a map.

ScopedValue<Map<String, String>> INPUTDATA = ScopedValue.newInstance();
ScopedValue<SSLContext> SSL_CTX = ScopedValue.newInstance();

ScopedValue.Carrier executionScope = ScopedValue
  .where(INPUTDATA, Map.copyOf(data))
  .where(SSL_CTX, SSLContext.getDefault());

executionScope.run(() -> {
  String inputName = INPUTDATA
    .orElseThrow(IllegalArgumentException::new)
    .getOrDefault("name", "JON");
  if(SSL_CTX.isBound()) {
    String lastName = getLastName(SSL_CTX.get());
    inputName = "%s %s".formatted(inputName, lastName);
  }
  IO.println(inputName);
});

The two ScopedValue instances are created in the first two lines. In lines 3 to 5, the applicable values are set for the scope. An immutable copy of the input data and a default SSLContext are configured. At this point, run is not called directly; instead, the ScopedValue.Carrier is saved for later use. It represents a scope in which a combination of ScopedValues has defined values. Such a ScopedValue.Carrier is useful if, for example, several tasks are to be executed in one scope. In the last block, such a task is passed for execution. Within the task, the code first checks whether a value has been configured for INPUTDATA. If this is not the case, an IllegalArgumentException is created and thrown. A check that the values to be accessed have actually been set is necessary, as there is no guarantee of this within a ScopedValue.Carrier. Another way to check whether the value has been set is used for the SSLContext. Here, the ScopedValue#isBound method is called before the access is carried out. The basic semantics are similar to those of an Optional. It is important to note that the run call executes the Runnable in the current thread, and that the bindings only exist for the duration of that call.

The differences between ScopedValue and the similar ThreadLocal are shown in the table below.

ScopedValue       | ThreadLocal
Immutable carrier | Supplier
Same instance     | Instance per thread
Rebindable        | Mutable value

Comparison of ThreadLocal and ScopedValue

A ScopedValue is an immutable data carrier, in contrast to a ThreadLocal, which is a supplier of a specific value. A ScopedValue is always the same instance, whereby the provided value can be rebound for child threads. Because the two are used in similar ways, ThreadLocals should be replaced by ScopedValues wherever possible to enable the efficient use of virtual threads.
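The "instance per thread" behavior of ThreadLocal from the comparison can be made visible with a small sketch. ExpensiveContext is a hypothetical placeholder for something costly such as an SSLContext, and the class ThreadLocalCopiesDemo exists only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a ThreadLocal with an initializer creates one instance per thread.
class ThreadLocalCopiesDemo {

    // Hypothetical stand-in for an object that is expensive to create.
    static final class ExpensiveContext {}

    static final ThreadLocal<ExpensiveContext> CONTEXT =
            ThreadLocal.withInitial(ExpensiveContext::new);

    static int distinctInstances(int threadCount) {
        // ExpensiveContext does not override equals, so the set compares by identity.
        Set<ExpensiveContext> seen = ConcurrentHashMap.newKeySet();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread t = new Thread(() -> {
                seen.add(CONTEXT.get()); // first access creates this thread's copy
                seen.add(CONTEXT.get()); // second access returns the same copy
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(4)); // prints 4
    }
}
```

With a handful of platform threads this duplication is tolerable; with very large numbers of virtual threads it is the cost that the shared instance of a ScopedValue is meant to avoid.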

Final thoughts

In this post, we’ve looked at Virtual Threads as a lightweight alternative to platform threads. We have seen Scoped Values and Structured Concurrency. These are optimized for use with Virtual Threads to share data and perform asynchronous operations. The examples shown are from an open source project in which a concurrent rule engine has been implemented.

I would like to give you a few tips on these new topics. Virtual threads do not need to be used sparingly; they are designed as throwaway instances. Reusing or pooling them undermines optimizations implemented in the JVM, and the desired effect is not achieved. I therefore recommend that you switch from platform threads to virtual threads wherever possible. In the simplest case, it is just a matter of replacing an ExecutorService instance.

During your migration, also look out for ThreadLocal instances that are never changed. These are candidates for a direct migration to ScopedValues. If you use IO operations in concurrent code, this is the place for Structured Concurrency. Do you need support in creating a migration plan, or do you have further questions? Then I am happy to help at conferences and via the common social media channels.
