Writing Readable Code with Algebraic Data Types and Pattern Matching in Java

Readable code is essential to the maintainability of any application, and there are many techniques for writing it. This article discusses one such technique, which combines Algebraic Data Types and Pattern Matching, two important sets of features from Project Amber [1]. We first look at what Algebraic Data Types are, then at some of Java's Pattern Matching features, and finally at an example that uses them together to write readable code.

Algebraic Data Types

Algebraic Data Types (ADTs) are data-types that are created by combining existing data-types, mainly by applying algebraic operations to them. There are two main categories of ADTs: Product Types and Sum Types. Let’s explore them both.

Product Types

In Product Types, each possible value of the resulting data-type has a value for each of its constituent types. For example, we can create a Point type by combining two int types (the two coordinates of a point).

In Java, a product-type can be implemented using a record [2].

record Point(int x, int y) { }

As another example, an Address has a houseNumber, a street, a city and a country. In Java, this can be implemented as:

record Address(int houseNumber, String street, String city, String country) { }
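
As a minimal sketch of what a record provides beyond the one-line declaration, the example below shows the members the compiler derives from the components. The class name ProductTypeDemo is only illustrative and is not part of the article's code.

```java
// A record is a transparent, immutable carrier for its components.
record Point(int x, int y) { }

// Hypothetical demo class, named here only for illustration.
class ProductTypeDemo {
    public static void main(String[] args) {
        Point a = new Point(3, 4);
        Point b = new Point(3, 4);

        // An accessor is generated for every component.
        System.out.println(a.x() + ", " + a.y());

        // equals() and hashCode() are based on the component values,
        // so two points with the same coordinates are equal.
        System.out.println(a.equals(b));

        // toString() lists the components by name.
        System.out.println(a);
    }
}
```

Because equality is defined by the components, a Point behaves like a plain value: it is fully described by its x and y.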

Sum Types

In Sum Types, each possible value of the resulting data-type is one of the values of its constituent types. For example, let’s say a shape can either be a triangle, a quadrilateral or a circle.

In Java, we can implement this by creating a sealed interface [3] called Shape with three implementations: Triangle, Quadrilateral and Circle. This way, any instance of Triangle, Quadrilateral or Circle is of type Shape, and a Shape instance can only be one of these three implementation types and nothing else.

sealed interface Shape {
    record Triangle(Point p1, Point p2, Point p3) implements Shape { }
    record Quadrilateral(Point p1, Point p2, Point p3, Point p4) implements Shape { }
    record Circle(Point p1, int radius) implements Shape { }
}
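
The closed nature of a sealed hierarchy can also be observed at runtime. The sketch below repeats the Shape definition so that it is self-contained and uses the reflection methods Class.isSealed() and Class.getPermittedSubclasses(). The class name SumTypeDemo is an assumption made for illustration.

```java
record Point(int x, int y) { }

sealed interface Shape {
    record Triangle(Point p1, Point p2, Point p3) implements Shape { }
    record Quadrilateral(Point p1, Point p2, Point p3, Point p4) implements Shape { }
    record Circle(Point p1, int radius) implements Shape { }
}

// Hypothetical demo class, named here only for illustration.
class SumTypeDemo {
    // Number of types that are allowed to implement Shape.
    static int permittedCount() {
        return Shape.class.getPermittedSubclasses().length;
    }

    public static void main(String[] args) {
        System.out.println("sealed: " + Shape.class.isSealed());
        for (Class<?> implementation : Shape.class.getPermittedSubclasses()) {
            System.out.println(implementation.getSimpleName());
        }
    }
}
```

Since the interface has no permits clause, the permitted implementations are the ones declared in the same source file, which here are the three nested records.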

Pattern Matching

In Java, Pattern Matching means testing whether an object has a particular structure and, if it does, extracting data from it. Several pattern-matching features have been introduced in Java as part of Project Amber. Let’s discuss the ones we need for our example later on.

Pattern Matching for Instanceof

instanceof is an operator that checks whether an object is an instance of a class or interface. A successful check is typically followed by a cast to that class or interface, and then by code that accesses its members.

Before Pattern Matching for instanceof, the check, the cast and the use were separate steps. For example:

if (object instanceof Point) {
    Point p = (Point) object;
    double dist = Math.sqrt(p.x() * p.x() + p.y() * p.y());
}

Here we check whether object is an instance of Point and, if it is, declare a variable p of type Point (by casting object to Point) for use in the subsequent code. With Pattern Matching for instanceof [4], the check, the cast and the declaration fit into a single expression:

if (object instanceof Point p) {
    double dist = Math.sqrt(p.x() * p.x() + p.y() * p.y());
}
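
The pattern variable is in scope wherever the compiler can prove that the match succeeded, not only inside the if block. The following sketch illustrates this under the assumption that -1 is an acceptable "not a Point" result. The class and method names are hypothetical.

```java
record Point(int x, int y) { }

// Hypothetical demo class, named here only for illustration.
class InstanceofDemo {
    // Returns the distance from the origin, or -1 if the argument is not a Point.
    static double distanceFromOrigin(Object object) {
        if (!(object instanceof Point p)) {
            return -1;
        }
        // p is in scope here: this line is only reached when the match succeeded.
        return Math.sqrt(p.x() * p.x() + p.y() * p.y());
    }

    // The binding can also be used on the right-hand side of &&.
    // A null argument never matches, so no separate null check is needed.
    static boolean isOrigin(Object object) {
        return object instanceof Point p && p.x() == 0 && p.y() == 0;
    }

    public static void main(String[] args) {
        System.out.println(distanceFromOrigin(new Point(3, 4)));
        System.out.println(isOrigin(new Point(0, 0)));
    }
}
```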

Record Patterns

In the above example, because Point is a record, we can deconstruct [5] it into its components. As shown in the example below, we can bind the x and y coordinates of the point directly and calculate the distance of the point from the origin.

if (object instanceof Point(int x, int y)) {
    double dist = Math.sqrt(x * x + y * y);
}

We can also apply the deconstruction pattern recursively. For example,

if (shape instanceof Circle(Point(int x, int y), int radius)) {
    double dist = Math.sqrt(x * x + y * y) - radius;
}

Here, we check whether the given shape is a circle and, if it is, deconstruct it into a point (the center of the circle) and a radius. We then further deconstruct the point into its x and y coordinates.
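
A self-contained version of the nested pattern might look like the sketch below. It uses var in the pattern so that the component types are inferred, and it assumes that returning Double.NaN for anything other than a circle is acceptable. RecordPatternDemo and originToEdge are illustrative names.

```java
record Point(int x, int y) { }

sealed interface Shape {
    record Triangle(Point p1, Point p2, Point p3) implements Shape { }
    record Quadrilateral(Point p1, Point p2, Point p3, Point p4) implements Shape { }
    record Circle(Point p1, int radius) implements Shape { }
}

// Hypothetical demo class, named here only for illustration.
class RecordPatternDemo {
    // Distance of the circle's center from the origin, minus its radius.
    // Returns NaN when the shape is not a circle (an assumption of this sketch).
    static double originToEdge(Shape shape) {
        // The Circle is taken apart, and its Point component is taken apart again.
        if (shape instanceof Shape.Circle(Point(var x, var y), var radius)) {
            return Math.sqrt(x * x + y * y) - radius;
        }
        return Double.NaN;
    }

    public static void main(String[] args) {
        System.out.println(originToEdge(new Shape.Circle(new Point(3, 4), 2)));
    }
}
```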

Pattern Matching for Switch

Record patterns can also be used in switch expressions and statements [6]. For example,

int sides = switch (shape) {
    case Triangle(Point p1, Point p2, Point p3) -> 3;
    case Quadrilateral(Point p1, Point p2, Point p3, Point p4) -> 4;
    case Circle(Point p1, int radius) -> 0;
};

In the example above, we count the sides of a shape: 3 for a triangle, 4 for a quadrilateral and 0 for a circle. Because Shape is sealed, the compiler knows that these three cases cover every possible Shape, so no default branch is needed.
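
A case label can also carry a guard (a when clause) that refines the match with a boolean condition. The sketch below is one possible illustration. The describe method and its wording are assumptions, not part of the article's example.

```java
record Point(int x, int y) { }

sealed interface Shape {
    record Triangle(Point p1, Point p2, Point p3) implements Shape { }
    record Quadrilateral(Point p1, Point p2, Point p3, Point p4) implements Shape { }
    record Circle(Point p1, int radius) implements Shape { }
}

// Hypothetical demo class, named here only for illustration.
class SwitchDemo {
    static String describe(Shape shape) {
        return switch (shape) {
            // The guarded case must come before the unguarded Circle case.
            case Shape.Circle(var center, var radius) when radius == 0 ->
                    "a single point at " + center;
            case Shape.Circle(var center, var radius) ->
                    "a circle of radius " + radius;
            // Type patterns match without deconstructing the record.
            case Shape.Triangle t -> "a triangle";
            case Shape.Quadrilateral q -> "a quadrilateral";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new Shape.Circle(new Point(1, 2), 0)));
        System.out.println(describe(new Shape.Circle(new Point(0, 0), 5)));
    }
}
```

The switch remains exhaustive because the unguarded Circle case covers every circle that the guarded case rejects.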

Unnamed Variables and Patterns

When a variable is unused, we can be explicit about this by using an underscore as the name of the variable. That way we inform the compiler (and readers of the code) that we don’t plan to use it.

For example, the above example can be rewritten with all variables in record patterns as underscores [7]:

int sides = switch (shape) {
    case Triangle(Point _, Point _, Point _) -> 3;
    case Quadrilateral(Point _, Point _, Point _, Point _) -> 4;
    case Circle(Point _, int _) -> 0;
};

We can make it even more concise by matching only on the type, without deconstructing the records:

int sides = switch (shape) {
    case Triangle _ -> 3;
    case Quadrilateral _ -> 4;
    case Circle _ -> 0;
};

Using Them All Together

Let’s now take a look at how we can use all these features together.

What we want to do is write a program that can evaluate expressions such as the following for a given set of variable values:

4x + 3y + 2z + 1
23ab + 34bc + 45ac

If we analyse the expressions above, we notice that there are some variables, and that the expressions are created by applying some operations to these variables. The variables in the above examples are x, y, z and a, b, c. The operations used in the above examples are addition and multiplication.

If we think of all possible expressions that we can form by these rules, we see that a valid expression can be a constant, a variable, the addition of two valid expressions or the multiplication of two valid expressions. We can denote this with a grammar like this:

Expression = a constant
             | a variable
             | an expression + an expression
             | an expression * an expression

We can create all valid expressions using this grammar, and any expression created by this grammar will be a valid expression.

This tells us that an expression can be one of four things: a constant, a variable, the addition of two expressions or the multiplication of two expressions. That means an expression is a sum-type. So let’s implement it using a sealed interface as below:

sealed interface Expression {

    record ConstExpr(int value) implements Expression { }
    record VarExpr(String name) implements Expression { }
    record AdditionExpr(Expression left, Expression right)
            implements Expression { }
    record MultiplicationExpr(Expression left, Expression right)
            implements Expression { }
}

Here we have four implementations of Expression, viz., ConstExpr (a constant expression), VarExpr (a variable expression), AdditionExpr (an addition expression) and MultiplicationExpr (a multiplication expression). All four are records. ConstExpr has a single integer value. VarExpr has a variable name as its member. AdditionExpr and MultiplicationExpr both have two expressions (left and right) as their members. So these four are product-types, and Expression is a sum-type. By using Algebraic Data Types for the definition of Expression, we ensure that any instance of Expression will be a valid expression.

Now for the evaluation part, we add a default method evaluate() to Expression. It evaluates the expression it is called on (this) and takes one argument: a Map<String, Integer> that maps variable names to their values. We can implement this method as below:

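default int evaluate(Map<String, Integer> values) {
    return switch (this) {
        // A constant evaluates to itself.
        case ConstExpr(var value) -> value;
        // A variable is looked up by name; unboxing throws a
        // NullPointerException if the map has no entry for it.
        case VarExpr(var name) -> values.get(name);
        // Compound expressions evaluate both operands recursively.
        case AdditionExpr(var left, var right) ->
                 left.evaluate(values) + right.evaluate(values);
        case MultiplicationExpr(var left, var right) ->
                 left.evaluate(values) * right.evaluate(values);
    };
}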

Here, we use pattern matching in a switch expression to check what type of expression we have and evaluate it accordingly. A constant expression already carries its value, so we simply return it. For a variable expression, we retrieve the value from the map passed in as the argument. For an addition expression, we recursively evaluate the left and right expressions and return the sum of the two results. A multiplication expression works the same way, except that we return the product of the two results.

Now using this, let’s see how we can evaluate the two example expressions. Before that, let’s add some methods to Expression to make creating expressions easier:
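
One detail worth handling explicitly is a variable that has no entry in the map: Map.get() returns null in that case, and unboxing null to int fails with a NullPointerException. The variant below is a sketch of one way to report this more clearly. The choice of IllegalArgumentException and the message text are assumptions, not part of the original design.

```java
import java.util.Map;

sealed interface Expression {

    record ConstExpr(int value) implements Expression { }
    record VarExpr(String name) implements Expression { }
    record AdditionExpr(Expression left, Expression right) implements Expression { }
    record MultiplicationExpr(Expression left, Expression right) implements Expression { }

    default int evaluate(Map<String, Integer> values) {
        return switch (this) {
            case ConstExpr(var value) -> value;
            case VarExpr(var name) -> {
                Integer bound = values.get(name);
                if (bound == null) {
                    // Assumed error handling: fail with a descriptive message.
                    throw new IllegalArgumentException("No value for variable: " + name);
                }
                yield bound;
            }
            case AdditionExpr(var left, var right) ->
                    left.evaluate(values) + right.evaluate(values);
            case MultiplicationExpr(var left, var right) ->
                    left.evaluate(values) * right.evaluate(values);
        };
    }
}
```

With this variant, evaluating 4x + 1 against a map that lacks x fails with a message that names the missing variable.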

static Expression constant(int value) {
    return new ConstExpr(value);
}
static Expression varExpr(String name) {
    return new VarExpr(name);
}
default Expression plus(Expression other) {
    return new AdditionExpr(this, other);
}
default Expression times(Expression other) {
    return new MultiplicationExpr(this, other);
}

The first expression can be evaluated as below (for values: x=10, y=15 and z=20):

// 4x + 3y + 2z + 1
static void main() {
    var expr1 =
            constant(4).times(varExpr("x"))
            .plus(constant(3).times(varExpr("y")))
            .plus(constant(2).times(varExpr("z")))
            .plus(constant(1));

    var values = Map.of("x", 10, "y", 15, "z", 20);
    System.out.println(expr1.evaluate(values));
}

// Output:
// 126

And the second one can be evaluated as follows (for values: a=10, b=15 and c=20):

// 23ab + 34bc + 45ac
static void main() {
    var expr2 =
            constant(23).times(varExpr("a")).times(varExpr("b"))
            .plus(
                 constant(34).times(varExpr("b")).times(varExpr("c"))
            ).plus(
                 constant(45).times(varExpr("a")).times(varExpr("c"))
            );

    var values = Map.of("a", 10, "b", 15, "c", 20);
    System.out.println(expr2.evaluate(values));
}

// Output:
// 22650

Data Oriented Programming

If we analyse the implementation of the example above, we notice that it is divided into two parts: the definition of the data-model (the Expression type) and the implementation of the behaviour (the evaluate() method).

To define the data-model, we used Algebraic Data Types, viz., records and sealed types. This ensures that only valid states are allowed to exist. To define the behaviour, we used pattern matching, which makes the code easy to read and easy to understand. This style of programming is known as Data Oriented Programming (DOP) [8] [9].

DOP is based on the following principles:

  • Model data immutably and transparently.
  • Model the data, the whole data, and nothing but the data.
  • Make illegal states unrepresentable.
  • Separate operations from data.
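
To illustrate the last principle, the sketch below adds a second operation on expressions without touching the data-model: a method that renders an expression as text. ExpressionPrinter and render are hypothetical names, and the Expression type is repeated (without evaluate()) so that the example is self-contained.

```java
sealed interface Expression {
    record ConstExpr(int value) implements Expression { }
    record VarExpr(String name) implements Expression { }
    record AdditionExpr(Expression left, Expression right) implements Expression { }
    record MultiplicationExpr(Expression left, Expression right) implements Expression { }
}

// Hypothetical class: the operation lives outside the data-model.
class ExpressionPrinter {
    static String render(Expression expression) {
        return switch (expression) {
            case Expression.ConstExpr(var value) -> Integer.toString(value);
            case Expression.VarExpr(var name) -> name;
            // Additions are always parenthesised so that precedence stays unambiguous.
            case Expression.AdditionExpr(var left, var right) ->
                    "(" + render(left) + " + " + render(right) + ")";
            case Expression.MultiplicationExpr(var left, var right) ->
                    render(left) + " * " + render(right);
        };
    }

    public static void main(String[] args) {
        Expression expr = new Expression.AdditionExpr(
                new Expression.MultiplicationExpr(
                        new Expression.ConstExpr(4), new Expression.VarExpr("x")),
                new Expression.ConstExpr(1));
        System.out.println(render(expr));
    }
}
```

If a fifth kind of expression were added to the sealed interface later, this switch would stop compiling until the new case is handled, which is how the compiler helps keep separated operations in sync with the data.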

If we follow these principles, we can utilize Data Oriented Programming to greatly increase the readability of Java programs.

References

[1] https://openjdk.org/projects/amber/
[2] https://openjdk.org/jeps/395
[3] https://openjdk.org/jeps/409
[4] https://openjdk.org/jeps/394
[5] https://openjdk.org/jeps/440
[6] https://openjdk.org/jeps/441
[7] https://openjdk.org/jeps/456
[8] https://www.infoq.com/articles/data-oriented-programming-java/
[9] https://inside.java/2024/05/23/dop-v1-1-introduction/
