Will a Comparator.comparing… in a `compareTo` method be optimized by the compiler in modern Java?

113 views Asked by At

When implementing the compareTo method required by Comparable interface, I want to use a Comparator generated by Comparator.comparing… convenience method.

For example:

package work.basil.example.threading.multitasking;

import java.util.Comparator;

public record Job(
        int id ,
        int amt
) implements Comparable < Job >
{
    @Override
    public int compareTo ( final Job job )
    {
        return Comparator.comparingInt( Job :: id ).thenComparingInt( Job :: amt ).compare( this , job );
    }
}

Will the call to Comparator.comparingInt( … ).thenComparingInt( … ) instantiate a Comparator on each and every execution of this method? Or will either a Java bytecode compiler (javac, etc.) or a JIT compiler (HotSpot, OpenJ9, etc.) optimize for that?

If the compiler does not optimize, I would make a private static final constant, like this:

package work.basil.example.threading.multitasking;

import java.util.Comparator;

public record Job(
        int id ,
        int amt
) implements Comparable < Job >
{
    private static final Comparator < Job > COMPARATOR = Comparator.comparingInt( Job :: id ).thenComparingInt( Job :: amt );

    @Override
    public int compareTo ( final Job job )
    {
        return Job.COMPARATOR.compare( this , job );
    }
}
1

There are 1 answers

1
rzwitserloot On BEST ANSWER

Will javac optimize this?

Yes. No. Mu. Depends on what you mean by 'optimize'.

Javac is on rails: It has to do precisely what the java spec says, no more, and no less, and the java spec includes very little optimization. The java spec pretty much spells out 'this java code turns into that bytecode, exactly, and you aren't allowed to deviate or you're not a java compiler'. Unlike, say, C compilers which will spend a ton of time analysing code flow and working with the configured targeted platform to introduce zany levels of optimization such as loop unrolls, rewriting ifs to follow annotated branch predictions, inlining methods, straight up eliminating whole swaths of code because that code path as per static analysis can't ever be touched (and in the #ifdef heavy C lang world, that's a very important optimization done by the compiler!)

So what does javac do here?

Loads of tutorials, and in some ways even the java spec itself, suggests that this:

list.forEach(x -> System.out.println(x.toLowerCase());

is just syntax sugar for:

list.forEach(new Consumer<String>() {
  @Override public void consume(String x) {
    System.out.println(x.toLowerCase());
  }
}

and for most intents and purposes it's a fine way to explain it to those who know what anonymous inner class literals are (that's new Type() {...} - the syntax used here), but this is incorrect specifically in the context of Basil's question.

Because the anonymous inner class literal is as per spec required to be compiled to code that instantiates an object, whereas the spec of lambdas and method refs explicitly does not require this, and indeed, goes out of its way to indicate that the 'identity' of the 'object' that a lambda expression or method ref becomes is not to be treated as relevant in any way because javac/the JVM reserves every right to make that useless.

And indeed, that is exactly what it does. Let's compile your first snippet and then run javap -c -v to have a look at what it is compiled to:

> cat Job.java
 ... [omitted parts] ...
@Override
    public int compareTo ( final Job job )
    {
        return Comparator.comparingInt( Job :: id ).thenComparingInt( Job :: amt ).compare( this , job );
    }

> javac Job.java
> javap -c -v Job
 .. [omitted _a lot_ of output] ..
 public int compareTo(Job);
    descriptor: (LJob;)I
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=3, locals=2, args_size=2
         0: invokedynamic #16,  0             // InvokeDynamic #0:applyAsInt:()Ljava/util/function/ToIntFunction;
         5: invokestatic  #20                 // InterfaceMethod java/util/Comparator.comparingInt:(Ljava/util/function/ToIntFunction;)Ljava/util/Comparator;
         8: invokedynamic #26,  0             // InvokeDynamic #1:applyAsInt:()Ljava/util/function/ToIntFunction;
        13: invokeinterface #27,  2           // InterfaceMethod java/util/Comparator.thenComparingInt:(Ljava/util/function/ToIntFunction;)Ljava/util/Comparator;
        18: aload_0
        19: aload_1
        20: invokeinterface #30,  3           // InterfaceMethod java/util/Comparator.compare:(Ljava/lang/Object;Ljava/lang/Object;)I
        25: ireturn
      LineNumberTable:
        line 5: 0

The crucial thing to notice in that output is that there isn't a single new bytecode instruction (new makes new objects) in that entire block. Contrast to replacing e.g. Job::id with the longhand form of:

new ToIntFunction<Job>() {
  @Override public int applyAsInt(Job job) {
    return job.id();
  }
}

and if we javap -c -v that we get quite different results:

stack=3, locals=2, args_size=2
         0: new           #16                 // class Job$1
         3: dup
         4: aload_0
         5: invokespecial #18                 // Method Job$1."<init>":(LJob;)V
         8: invokestatic  #21                 // InterfaceMethod java/util/Comparator.comparingInt:(Ljava/util/function/ToIntFunction;)Ljava/util/Comparator;
        11: invokedynamic #27,  0             // InvokeDynamic #0:applyAsInt:()Ljava/util/function/ToIntFunction;
        16: invokeinterface #31,  2           // InterfaceMethod java/util/Comparator.thenComparingInt:(Ljava/util/function/ToIntFunction;)Ljava/util/Comparator;
        21: aload_0
        22: aload_1
        23: invokeinterface #34,  3           // InterfaceMethod java/util/Comparator.compare:(Ljava/lang/Object;Ljava/lang/Object;)I
        28: ireturn

Note how here new is happening, class Job$1 refers to the anonymous inner class that is being created here (the new ToIntFunction<Job>() { .. }) code, its constructor is called (this constructor does nothing, and javac knows this, but as javac is on rails, it is as per spec required to insert that invokespecial #18 instruction, eventhough that constructor is a no-op... it's actually not quite a no-op, it takes as argument the Job, as it's a non-static inner class. javac knows this is not used. Again, immaterial, javac has to output all this unneeded cruft, spec requires it).

Going back to the 'nice' code without the new, all that happens is a bunch of indy invocations (indy is short for InvokeDynamic and was quite a humongous addition to the JVM/class file format). These are invoking the method '#0:applyAsInt', which is a method that takes no arguments (()), and returns an instance of ToIntFunction (that's what the Ljava/util/function/ToIntFunction; represents). This 'makes a value of type ToIntFunction, but doesn't, itself, do this work, instead it calls some nebulous thing with InDy to make this value.

So what's going on here? We need to scroll all the way to the end of javap's output for more insights:

BootstrapMethods:
  0: #77 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #65 (Ljava/lang/Object;)I
      #66 REF_invokeVirtual Job.id:()I
      #69 (LJob;)I
  1: #77 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
    Method arguments:
      #65 (Ljava/lang/Object;)I
      #70 REF_invokeVirtual Job.amt:()I
      #69 (LJob;)I
  2: #84 REF_invokeStatic java/lang/runtime/ObjectMethods.bootstrap:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/TypeDescriptor;Ljava/lang/Class;Ljava/lang/String;[Ljava/lang/invoke/MethodHandle;)Ljava/lang/Object;
    Method arguments:
      #8 Job
      #73 id;amt
      #75 REF_getField Job.id:I
      #76 REF_getField Job.amt:I
InnerClasses:
  public static final #96= #92 of #94;    // Lookup=class java/lang/invoke/MethodHandles$Lookup of class java/lang/invoke/MethodHandles

This is where the magic happens. These 'bootstrap methods' seem utterly nuts, what the heck is LambdaMetaFactory, what's MethodHandles$Lookup, why in the blazes is there a mention of an inner class, when compiling this code results in just a single class file instead of the usual Job$Inner.class file we expect to see when we wrote an actual inner class (and which, indeed, shows up if we write the anonymous inner class literal and compile it, then we get a Job$1.class)?

These are optimizations were introduced in Java8 (actually, invokedynamic as a bytecode instruction was introduced in OpenJDK7, but javac didn't generate InDy instructions until java8). Explaining exactly how it works is quite a lot and requires a complete dive into what indy does, but, to summarize for this specific question: Effectively all this hoopla does precisely what you want: It makes the 'cost' of that one-liner identical to having a separate constant.

Therefore, the conclusion is simple: No, you do NOT need to rewrite your simple snippet into that second form; do not introduce a constant, it will not be any faster. If anything, it'll be slower.

InvokeDynamic

These bootstrap methods are invoked 'as needed', but in this case, essentially only once ever, the results of them are laced to the relevant invokedynamic instruction so that the bootstrap method doesn't have to be run every time. It's as near as 'I made a constant to avoid every invocation of my compareTo method making objects and running a bunch of code' as can be, notably including that it does not create any garbage, not even short-lived garbage.

I recommend this article on oracle's java magazine about InvokeDynamic if you want to know exactly how it works. It also explains what bootstrap methods are and how InDy uses them.