Why does the bytecode generated by Java include operations of pushing onto the stack and immediately popping off the stack?


Here is an example:

public Integer add(Integer a) {
    a++;
    return a;
}

The corresponding bytecode instructions for this method are as follows

 0 aload_1
 1 astore_2
 2 aload_1
 3 invokevirtual #62 <java/lang/Integer.intValue : ()I>
 6 iconst_1
 7 iadd
 8 invokestatic #52 <java/lang/Integer.valueOf : (I)Ljava/lang/Integer;>
11 astore_1
12 aload_2
13 pop
14 aload_1
15 areturn

At offsets 12 and 13, you can observe a push onto the stack followed by an immediate pop.

I want to know why it is designed that way, and why it needs another copy of "a" in the LocalVariableTable.


There are 3 answers

Valerij Dobler

Because a++ is shorthand notation for a = a + 1, and you reassign the variable before you return it. Take a look at the bytecode for the following code:

public Integer add2(Integer a) {
    return a+1;
}
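To see why the old value has to be kept around at all: a++ is an expression whose result is the previous value, even though the variable is rebound. A minimal demonstration (the class and method names are made up for illustration):

```java
// Demonstrates that a++ evaluates to the old value while reassigning the
// variable - which is why the compiler needs a temporary copy of "a".
public class PostIncDemo {
    public static Integer[] step(Integer a) {
        Integer old = a++;              // old gets the previous value; a is rebound
        return new Integer[] { old, a };
    }

    public static void main(String[] args) {
        Integer[] r = step(1);
        System.out.println(r[0] + " " + r[1]); // prints "1 2"
    }
}
```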
Holger

Using the ++ operator on an Integer reference type is a rather uncommon case. Before Java 5’s introduction of auto-boxing, it wasn’t even possible.

It’s not unusual that software handles uncommon cases in a general way, rather than optimized.

You have an expression performing a side effect (incrementing a variable) but evaluating to the previous value as result. A general way to implement such an expression is to remember the old value in an additional variable before performing the side effect, followed by reading the variable.

Then, you’re using this expression as a statement. A general way of performing an expression as a statement, is to generate the expression code, followed by code to drop the result.

So the unnecessary code can be explained by the compiler following a general strategy rather than creating compact code for this specific scenario. As said, it’s a rather uncommon scenario.
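The two-step strategy described above can be sketched at the source level roughly as follows (a hand-written sketch, not actual compiler output; "tmp" is a hypothetical name for the compiler's extra local slot):

```java
// Source-level equivalent of the general lowering of "a++; return a;"
// for a boxed Integer, annotated with the corresponding bytecode.
public class LoweringSketch {
    public static Integer add(Integer a) {
        Integer tmp = a;                        // astore_2: remember the old value
        a = Integer.valueOf(a.intValue() + 1);  // unbox, iconst_1, iadd, box, astore_1
        // aload_2; pop: the statement discards the expression's result (tmp)
        return a;                               // aload_1; areturn
    }

    public static void main(String[] args) {
        System.out.println(add(41)); // prints 42
    }
}
```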

Note that the more common

public long add(long a) {
    a++;
    return a;
}

gets compiled to the more straightforward

public long add(long);
    Code:
       0: lload_1
       1: lconst_1
       2: ladd
       3: lstore_1
       4: lload_1
       5: lreturn

and the even more common

public int add(int a) {
    a++;
    return a;
}

will be compiled to

public int add(int);
    Code:
       0: iinc          1, 1
       3: iload_1
       4: ireturn

But note that this behavior is compiler specific. Whether general handling of such a scenario leads to unnecessary code depends on the internal design of the compiler. It’s also the compiler vendor’s decision, how much effort to spend on improving the code output for a specific scenario.

For example, when compiling your original code

public Integer add(Integer a) {
    a++;
    return a;
}

with Eclipse’s current compiler implementation, you’ll get

public java.lang.Integer add(java.lang.Integer);
    Code:
       0: aload_1
       1: invokevirtual #16   // Method java/lang/Integer.intValue:()I
       4: iconst_1
       5: iadd
       6: invokestatic  #22   // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
       9: astore_1
      10: aload_1
      11: areturn
raner

Most Java compilers perform little to no optimization on the generated bytecode, because optimizations can be done much more efficiently by the just-in-time compiler (and they can take into account dynamic information that was gathered at runtime).

If you look long enough at compiler-generated bytecode you will find many more similar scenarios where the compiler produces two consecutive opcodes that cancel each other out, or situations where two or three consecutive opcodes could be replaced by a single, more specialized, opcode. The bottom line is that optimizing such scenarios has relatively little impact compared to what can be done at runtime.