Overview
Java, the grandfather of JVM languages, continues to evolve and is becoming richer in features with each update, thus steadily improving the productivity of developers. In parallel with this evolution, other languages have popped up with their own sets of features (and often with more freedom to experiment than Java). Some of these features have made their way into Java, others will be added eventually, and others may continue to be debated for quite some time.
This article is written from the perspective of a Java programmer (and teacher) who also writes plenty of Scala (and a few other languages). Through code examples, it discusses a few of the features I miss the most when returning to Java, namely inferred semicolons, tuples, named arguments, local functions, tail-recursion optimization, type aliases, opaque types, and variance annotations.
Inferred Semicolons
String dots() {
    int v = calculate()
    if (v < 0) v = 0
    return ".".repeat(v)
}
Is that less legible than the proper Java alternative? Of course, adding the missing semicolons explicitly is only a matter of a few keystrokes. Not a big deal. I only include this missing feature because it is the first that bites me every time I switch back to Java after having programmed in one of the many languages in which the compiler can infer (most) semicolons.
Tuples
Consider a utility function timeIt that evaluates an expression and returns its value alongside the time it took (in seconds) to run the code:
static <A> ??? timeIt(Supplier<? extends A> code) { ... }
// used as:
var timedResult = timeIt(() -> /* some computation */);
Conceptually, the value returned by function timeIt is an aggregate of types A and double, for which there is no standard Java type. This leaves developers with several unattractive alternatives:
- introduce a library dependency (say, Apache Commons) and use Pair<A, Double>;
- define their own umpteenth implementation of a Pair<A, B> type; or
- create a custom record (say, TimedPair<A>) specifically linked to the timeIt function.
Furthermore, the code that uses timeIt may continue with something like:
if (timedResult.getRight() > 1.0) { /* use timedResult.getLeft() */ }
where getRight and getLeft (or some other names) are used to extract parts of the pair. A Java that integrates pairs, triples or general tuples at the language level would result in better code, maybe something like:
var (value, time) = timeIt(() -> /* some computation */);
if (time > 1.0) { /* use value */ }
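For comparison, the third alternative listed above can be sketched in current Java with a record. This is only an illustration: the names Timed and TimeItSketch are hypothetical, and records require Java 16 or later.

```java
import java.util.function.Supplier;

// Hypothetical record standing in for a tuple of (A, double).
record Timed<A>(A value, double seconds) {}

class TimeItSketch {
    // Evaluates the supplier once and pairs its result with the elapsed time in seconds.
    static <A> Timed<A> timeIt(Supplier<? extends A> code) {
        long start = System.nanoTime();
        A value = code.get();
        double seconds = (System.nanoTime() - start) / 1e9;
        return new Timed<>(value, seconds);
    }

    public static void main(String[] args) {
        var timedResult = timeIt(() -> "x".repeat(1_000).length());
        // Without tuples, the parts are extracted through named accessors.
        if (timedResult.seconds() > 1.0) {
            System.out.println("slow computation of " + timedResult.value());
        } else {
            System.out.println("computed " + timedResult.value());
        }
    }
}
```

The accessors are at least more descriptive than getLeft and getRight, but the record remains a one-off type tied to a single function.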
Named Arguments
Many languages allow (or require) function calls to specify the names of arguments in addition to their values. This can improve code legibility tremendously, especially with Boolean arguments or when multiple arguments have the same type. Contrast, for instance, current Java code:
String letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char[] array = letters.toCharArray();
String s1 = letters.substring(7, 12); // is 12 a length or an index?
String s2 = String.valueOf(array, 7, 5); // is 5 a length or an index?
Writer w = new PrintWriter(out, true); // what is true?
canvas.setBounds(10, 20, 30, 40); // which is which?
with what Java could be:
String s1 = letters.substring(7, endIndex = 12);
String s2 = String.valueOf(array, 7, count = 5);
Writer w = new PrintWriter(out, autoFlush = true);
canvas.setBounds(x = 10, y = 20, width = 30, height = 40);
It is true that modern Integrated Development Environments typically inlay argument names, making the code much easier to read. Still, those aids disappear when viewing code outside an IDE, e.g., on GitHub.
Local Functions
In contrast to more recent languages, Java does not allow code to define functions locally inside methods. Instead, one typically introduces additional, private methods in the current class, as in the following illustration:
List<String> updateStrings(List<String> list,
        Set<String> set, String prefix, String suffix) {
    return Stream.concat(
        list.stream().map(str -> updateOne(str, prefix, suffix)),
        set.stream().map(str -> updateOne(str, prefix, suffix))
    ).toList();
}
private String updateOne(String str, String prefix, String suffix) {
    return prefix + str + suffix;
}
I see two main drawbacks to this approach:
- the helper function (here, updateOne) occupies a larger scope (the enclosing class) than is needed (the updateStrings method would be enough);
- arguments of the method (here, prefix and suffix) need to be explicitly forwarded to the function.
On larger, more realistic examples, these can lead to many methods added to a class and to lengthy lists of arguments. A possible alternative is to define a local function by using a lambda expression:
List<String> updateStrings(List<String> list,
        Set<String> set, String prefix, String suffix) {
    Function<String, String> updateOne = str -> {
        return prefix + str + suffix;
    };
    return Stream.concat(
        list.stream().map(updateOne), set.stream().map(updateOne)
    ).toList();
}
This alleviates the disadvantages of the private method approach, but the syntax is unfriendly. It would be better if updateOne could be defined directly as a method, i.e.:
List<String> updateStrings(List<String> list,
        Set<String> set, String prefix, String suffix) {
    String updateOne(String str) {
        return prefix + str + suffix;
    }
    return Stream.concat(
        list.stream().map(updateOne), set.stream().map(updateOne)
    ).toList();
}
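Another workaround available in today's Java is a local class, which is scoped to the method and captures its effectively final parameters. The sketch below is one possible approximation, not a recommendation; the names LocalHelperSketch and Helper are hypothetical, and the method is made static only so that the example runs on its own.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

class LocalHelperSketch {
    static List<String> updateStrings(List<String> list,
            Set<String> set, String prefix, String suffix) {
        // A local class is visible only inside this method and can read
        // the effectively final parameters prefix and suffix directly.
        class Helper {
            String updateOne(String str) {
                return prefix + str + suffix;
            }
        }
        Helper helper = new Helper();
        return Stream.concat(
            list.stream().map(helper::updateOne),
            set.stream().map(helper::updateOne)
        ).toList();
    }

    public static void main(String[] args) {
        // prints [<a>, <b>, <c>]
        System.out.println(updateStrings(List.of("a", "b"), Set.of("c"), "<", ">"));
    }
}
```

This keeps updateOne a real method with a narrow scope, at the price of a wrapper class and an instance that exist only to host it.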
Tail Recursion Optimization
Consider the problem of finding in an iterator of strings two consecutive values that are equal. This can be achieved by either variant of findDoublet below:
// using a loop
Optional<String> findDoublet(Iterator<String> words) {
    if (!words.hasNext()) return Optional.empty();
    String first = words.next();
    while (words.hasNext()) {
        String second = words.next();
        if (first.equals(second)) return Optional.of(first);
        first = second;
    }
    return Optional.empty();
}
// using recursion
Optional<String> findDoublet(Iterator<String> words) {
    return words.hasNext() ? findNext(words, words.next()) : Optional.empty();
}
private Optional<String> findNext(Iterator<String> words, String first) {
    if (!words.hasNext()) return Optional.empty();
    String second = words.next();
    return first.equals(second) ? Optional.of(first) : findNext(words, second);
}
In some languages, such as Scala or Kotlin, the two variants would be compiled into (roughly) the same bytecode: The compiler would replace the recursive call to findNext with a reassignment to variable first and a loop.
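To make that transformation concrete, here is a hand-written sketch of the rewrite such a compiler would perform on findNext. The class name TailCallSketch and the method name findNextLoop are hypothetical, and the methods are static only so that the example is self-contained.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

class TailCallSketch {
    // Tail-recursive original: the recursive call is the last action performed.
    static Optional<String> findNext(Iterator<String> words, String first) {
        if (!words.hasNext()) return Optional.empty();
        String second = words.next();
        return first.equals(second) ? Optional.of(first) : findNext(words, second);
    }

    // Manual equivalent of the optimized form: the tail call becomes
    // a reassignment of the parameter followed by another loop iteration.
    static Optional<String> findNextLoop(Iterator<String> words, String first) {
        while (true) {
            if (!words.hasNext()) return Optional.empty();
            String second = words.next();
            if (first.equals(second)) return Optional.of(first);
            first = second; // replaces the call findNext(words, second)
        }
    }

    public static void main(String[] args) {
        Iterator<String> words = List.of("a", "b", "b", "c").iterator();
        System.out.println(findNextLoop(words, words.next()));
    }
}
```

Both methods return the same result, but in current Java only the loop is safe on very long iterators, since each recursive call of findNext consumes a stack frame.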
Arguably, there is no compelling reason to solve the doublet problem recursively instead of by using a loop. However, there are cases where the recursive code is less unwieldy than the loop-based equivalent. This happens to me most often when dealing with the need to retry a computation, as in lock-free concurrent algorithms. For instance, method addAndGet in AtomicInteger currently delegates to the internal method getAndAddInt, which is implemented as:
public final int getAndAddInt(Object o, long offset, int delta) {
    int v;
    do {
        v = getIntVolatile(o, offset);
    } while (!weakCompareAndSetInt(o, offset, v, v + delta));
    return v;
}
A recursive variant would avoid the negation in the test and make the retry more explicit:
public final int getAndAddInt(Object o, long offset, int delta) {
    int v = getIntVolatile(o, offset);
    if (weakCompareAndSetInt(o, offset, v, v + delta)) return v;
    else return getAndAddInt(o, offset, delta);
}
In general, it would be nice to have the freedom to choose a tail-recursive function over a loop and not have to worry about performance and the growth of the execution stack.
Type Aliases
The lack of type aliases has been a regular complaint since at least the introduction of generics in Java. Suppose, for instance, that an application centers on the concept of mapping keys to lists of files. Instead of repeating the type Map<K, List<File>> all over the code, it would be convenient to introduce a custom name such as FileTable<K>. Some languages let you do exactly that:
// This is Kotlin
typealias FileTable<K> = Map<K, List<File>>
// This is Scala
type FileTable[K] = Map[K, List[File]]
// This is Rust
type FileTable<K> = HashMap<K, Vec<File>>
Java has no such mechanism yet. The introduction of type inference has somewhat improved the situation:
var table = new HashMap<K, List<File>>();
// or
HashMap<K, List<File>> table = new HashMap<>();
Either form is an improvement over the old:
Map<K, List<File>> table = new HashMap<K, List<File>>();
Still, it would be nice to be able to replace something like:
Map<K, List<File>> combine(Map<K, List<File>> table1,
                           Map<K, List<File>> table2)
with the friendlier:
FileTable<K> combine(FileTable<K> table1, FileTable<K> table2)
Opaque Types
More important maybe than type aliases is the notion of opaque types. While Map<K, List<File>> and FileTable<K> would be interchangeable, opaque types let you define types that are considered distinct at compile-time while retaining the same representation at runtime. For instance, the Scala syntax:
opaque type Length = Double
introduces a new type Length, distinct from Double during compilation, but erased and replaced with Double in the generated code. The beauty of this is that code that expects a Length argument cannot be compiled when called on a Double, and vice versa. Most importantly, this type-safety comes with no runtime cost. No Length wrapper is used and all Length values are simple double values in the generated code. (Scala’s Double is compiled into Java’s double on the JVM.)
This is especially useful with methods that rely on several arguments of the same type:
Length expand(Length len, Temperature temp)
prevents argument flipping and thus is safer than:
double expand(double len, double temp)
but in current Java, this safety can only be achieved at the cost of wrapper objects at runtime. (Presumably, Project Valhalla will soon address this Java weakness.)
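The wrapper-based approach mentioned above might look like the following sketch. The record names follow the signature shown earlier, but the units (meters, kelvin), the class ExpansionSketch, and the expansion formula with its coefficient are assumptions made up purely for illustration.

```java
// Hypothetical wrapper types: distinct at compile time,
// but each value is a separate object at runtime.
record Length(double meters) {}
record Temperature(double kelvin) {}

class ExpansionSketch {
    // Placeholder coefficient, chosen only to make the example computable.
    static final double COEFFICIENT = 1.2e-5;

    static Length expand(Length len, Temperature temp) {
        return new Length(len.meters() * (1 + COEFFICIENT * temp.kelvin()));
    }

    public static void main(String[] args) {
        Length result = expand(new Length(10.0), new Temperature(50.0));
        System.out.println(result);
        // Flipping the arguments is rejected by the compiler:
        // expand(new Temperature(50.0), new Length(10.0));
    }
}
```

The compile-time safety is the same as with opaque types; the difference is that every Length and Temperature here is an allocated object rather than a bare double.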
Variance Annotations
My last missing feature is a big one and requires a longer explanation. Recall that Java generics are non-variant: List<Integer> is not a subtype of List<Number> even though Integer is a subtype of Number. There is a good reason for that: List<Integer> does not support all the operations of List<Number> as a proper subtype should. For instance, you can add a Double object to a List<Number> but not to a List<Integer>.
As a result, a summing method defined as:
double sum(List<Number> nums)
cannot be invoked on a List<Integer> value. Instead, one needs to write:
double sum(List<? extends Number> nums)
This now works because List<Integer> is a subtype of List<? extends Number>.
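A minimal runnable version of this summing method is sketched below. The class name SumSketch is hypothetical, and the method is static only for the sake of a self-contained example.

```java
import java.util.List;

class SumSketch {
    // Accepts a list of any subtype of Number thanks to the wildcard bound.
    static double sum(List<? extends Number> nums) {
        double total = 0.0;
        for (Number n : nums) {
            total += n.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> ints = List.of(1, 2, 3);
        List<Double> doubles = List.of(0.5, 1.5);
        System.out.println(sum(ints));    // prints 6.0
        System.out.println(sum(doubles)); // prints 2.0
    }
}
```

Inside sum, the wildcard also prevents adding elements to nums, which is precisely the operation that makes List unsafe to treat covariantly.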
However, the use of type bounds such as ? extends Number tends to complicate the signatures of Java methods. For instance, method or in the standard class Optional<T> has the following signature:
public Optional<T> or(Supplier<? extends Optional<? extends T>> supplier)
instead of the simpler:
public Optional<T> or(Supplier<Optional<T>> supplier)
This is because Optional and Supplier, like List, are non-variant. Yet, contrary to List, there would be little harm in having Optional<Integer> be a subtype of Optional<Number> and Supplier<Integer> be a subtype of Supplier<Number>.
This is exactly what languages such as Scala, Kotlin, or C# do. In Scala, Option[Integer] is a subtype of Option[Number] and in Kotlin, Result<Integer> is a subtype of Result<Number>. This is because Option is defined in Scala as Option[+T] and Result is defined in Kotlin as Result<out T>. The added syntax “+” and “out” makes Option and Result covariant: if S is a subtype of T, then Option[S] is a subtype of Option[T] and Result<S> is a subtype of Result<T>. By contrast, non-variant types, such as Set in Scala or Array in Kotlin, are defined without a “+” or “out” annotation.
Developers can use covariance (and its dual contra-variance) to simplify method signatures. Consider, for instance, a method that invokes a series of independent tasks on the same input (say, in parallel). In Scala, you can define it as:
def run[I, O](input: I, tasks: Iterable[I => O]): List[O] = ...
where I and O are the input and output types, respectively, and I => O is a function from I to O (input of type I and output of type O). To achieve the same flexibility as this Scala code, a Java method would have to use the following signature:
<I, O> List<O> run(I input,
        Iterable<? extends Function<? super I, ? extends O>> tasks)
None of the type bounds used in the Java variant are needed in Scala because Iterable is covariant and functions such as I => O are covariant in their outputs and contra-variant in their inputs.
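The flexibility bought by these bounds can be seen in the following sketch. The body of run is an assumption (it simply applies the tasks sequentially rather than in parallel), and the class name RunSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class RunSketch {
    // Sequential stand-in for the method discussed above.
    static <I, O> List<O> run(I input,
            Iterable<? extends Function<? super I, ? extends O>> tasks) {
        List<O> outputs = new ArrayList<>();
        for (Function<? super I, ? extends O> task : tasks) {
            outputs.add(task.apply(input));
        }
        return outputs;
    }

    public static void main(String[] args) {
        // Tasks that accept any Object and produce Integer values...
        List<Function<Object, Integer>> tasks =
            List.of(o -> o.toString().length(), Object::hashCode);
        // ...can still be run with I = String and O = Number.
        List<Number> results = RunSketch.<String, Number>run("hello", tasks);
        System.out.println(results);
    }
}
```

With I fixed to String and O to Number, a parameter declared as plain Iterable<Function<I, O>> would reject this list of tasks; the wildcards are what make the call compile.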
Conclusion
As a language, Java has had—and continues to have—remarkable success and longevity. This requires, on the part of its designers, a careful balance between stability and innovation. Brian Goetz often remarks that he gets hit on both sides, for moving too slowly and too recklessly!
This article is an overview of some features I wish Java had because I enjoy them in other languages. All the missing features discussed here exist in both Scala and Kotlin, for instance. There are more features from other languages that would make a great addition to Java, such as extensions, type classes, more type inference, or a better integration of primitive types in the type system, but what I have illustrated here constitutes the top of my list. I suspect that some of my favorites will make it into Java soon, some might take longer, and others will never be part of the language (and maybe cannot without Java ceasing to be Java).
In the meantime, I am delighted that we live in a healthy landscape, where the “big guy”, Java, continues to evolve and smaller players keep experimenting with possible features and evolutions. It’s an exciting time for programming languages!