When IntelliJ decompiled a class file from another project of mine (using the Fernflower decompiler), I marveled at how close the decompiled code was to the original source: even the names of method-local variables were the same.
I don't know much about how the Java compilation process or the JVM works. My naive understanding is that the names of the public stuff may need to be kept after compilation, but the names of local variables are simply mnemonics for human readers, useless outside of their scope, and I don't think the JVM needs this information.
So, is this information simply figured out by the decompiler through some magic, or does the compiled class retain a lot of information, and what for?
In the end, it depends on the actual compiler and the exact compilation settings.
As you noted, the JVM itself does not need any local variable names. (Strictly speaking, it doesn't really need method names either. It is even possible to have two methods with the same name and arguments that differ only in the return type, but I'd have to look up more details in the spec to say anything more definitive about this.) But the class file can contain additional debug information that goes beyond what the JVM requires.
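The remark about two methods that differ only in their return type can be observed from plain Java: when a method is overridden with a more specific (covariant) return type, javac emits an additional synthetic bridge method, so the class file ends up with two methods of the same name and parameters. The following is only a minimal sketch of that effect; the names ReturnTypeOverloadDemo, Animal, Cat and create are hypothetical and chosen purely for illustration.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReturnTypeOverloadDemo {

    // Hypothetical base class with a factory-style method.
    static class Animal {
        Animal create() {
            return new Animal();
        }
    }

    // Overrides create() with a covariant (more specific) return type.
    static class Cat extends Animal {
        @Override
        Cat create() {
            return new Cat();
        }
    }

    /** Lists all methods named "create" that are present in Cat's class file. */
    static List<String> describeCreateMethods() {
        List<String> result = new ArrayList<>();
        for (Method method : Cat.class.getDeclaredMethods()) {
            if (method.getName().equals("create")) {
                result.add(method.getReturnType().getSimpleName() + " create()"
                        + (method.isBridge() ? " [bridge]" : ""));
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        for (String description : describeCreateMethods()) {
            System.out.println(description);
        }
    }
}
```

When compiled with javac, this should report two methods named create for Cat: the declared one returning Cat, and a compiler-generated bridge returning Animal. In the class file they are told apart by their descriptors, which include the return type.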
The standard Java compiler is javac, and its documentation already contains some hints about the possible debug information:
-g
Generates all debugging information, including local variables. By default, only line number and source file information is generated.
-g:none
Does not generate any debugging information.
-g:[keyword list]
Generates only some kinds of debugging information, specified by a comma separated list of keywords. Valid keywords are:
- source : Source file debugging information.
- lines: Line number debugging information.
- vars: Local variable debugging information.
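The effect of these options can also be checked from within Java itself. The sketch below assumes that it runs on a JDK (so that the javax.tools compiler API is available) and uses a hypothetical class Sample with a local variable named doubled. It compiles the same source with different -g settings and then simply searches the raw class file bytes for the attribute name and the variable name. This is a rough check, not a real class file parser, and the temporary directories it creates are not cleaned up.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class DebugFlagDemo {

    // Hypothetical sample source; "doubled" is the local variable to look for.
    static final String SOURCE =
            "public class Sample {\n"
          + "    public int twice(int input) {\n"
          + "        int doubled = input * 2;\n"
          + "        return doubled;\n"
          + "    }\n"
          + "}\n";

    /**
     * Compiles SOURCE with the given -g option and returns the bytes of
     * Sample.class, decoded as Latin-1 so that ASCII names can be searched.
     */
    static String compile(String debugOption) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found (a JDK is required)");
        }
        try {
            Path directory = Files.createTempDirectory("debugflag");
            Path sourceFile = directory.resolve("Sample.java");
            Files.write(sourceFile, SOURCE.getBytes(StandardCharsets.UTF_8));
            int exitCode = compiler.run(null, null, null,
                    debugOption, "-d", directory.toString(), sourceFile.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("Compilation failed with " + debugOption);
            }
            byte[] classBytes = Files.readAllBytes(directory.resolve("Sample.class"));
            return new String(classBytes, StandardCharsets.ISO_8859_1);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String option : new String[] {"-g:none", "-g:lines,source", "-g"}) {
            String classFile = compile(option);
            System.out.println(option
                    + ": LineNumberTable=" + classFile.contains("LineNumberTable")
                    + ", LocalVariableTable=" + classFile.contains("LocalVariableTable")
                    + ", name 'doubled'=" + classFile.contains("doubled"));
        }
    }
}
```

The method name twice is present in all variants, whereas the local variable name should only show up when local variable debugging information is requested.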
One can try this out with an example:
public class ExampleClass {

    public static void main(String[] args) {
        ExampleClass exampleClass = new ExampleClass();
        exampleClass.exampleMethod();
    }

    public void exampleMethod() {
        String string = "This is an example";
        for (int counter = 0; counter < 10; counter++) {
            String localResult = string + counter;
            System.out.println(localResult);
        }
    }
}
Compiling this with
javac ExampleClass.java -g:none
will generate a class file. Information about this class file can be printed with
javap -c -v -l ExampleClass.class
(where -c means that the code should be disassembled, -v means that the output should be verbose, and -l means that the line number and local variable tables should be printed). The output is as follows:
public class ExampleClass
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #13.#22 // java/lang/Object."<init>":()V
#2 = Class #23 // ExampleClass
#3 = Methodref #2.#22 // ExampleClass."<init>":()V
#4 = Methodref #2.#24 // ExampleClass.exampleMethod:()V
#5 = String #25 // This is an example
#6 = Class #26 // java/lang/StringBuilder
#7 = Methodref #6.#22 // java/lang/StringBuilder."<init>":()V
#8 = Methodref #6.#27 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#9 = Methodref #6.#28 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#10 = Methodref #6.#29 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#11 = Fieldref #30.#31 // java/lang/System.out:Ljava/io/PrintStream;
#12 = Methodref #32.#33 // java/io/PrintStream.println:(Ljava/lang/String;)V
#13 = Class #34 // java/lang/Object
#14 = Utf8 <init>
#15 = Utf8 ()V
#16 = Utf8 Code
#17 = Utf8 main
#18 = Utf8 ([Ljava/lang/String;)V
#19 = Utf8 exampleMethod
#20 = Utf8 StackMapTable
#21 = Class #35 // java/lang/String
#22 = NameAndType #14:#15 // "<init>":()V
#23 = Utf8 ExampleClass
#24 = NameAndType #19:#15 // exampleMethod:()V
#25 = Utf8 This is an example
#26 = Utf8 java/lang/StringBuilder
#27 = NameAndType #36:#37 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#28 = NameAndType #36:#38 // append:(I)Ljava/lang/StringBuilder;
#29 = NameAndType #39:#40 // toString:()Ljava/lang/String;
#30 = Class #41 // java/lang/System
#31 = NameAndType #42:#43 // out:Ljava/io/PrintStream;
#32 = Class #44 // java/io/PrintStream
#33 = NameAndType #45:#46 // println:(Ljava/lang/String;)V
#34 = Utf8 java/lang/Object
#35 = Utf8 java/lang/String
#36 = Utf8 append
#37 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#38 = Utf8 (I)Ljava/lang/StringBuilder;
#39 = Utf8 toString
#40 = Utf8 ()Ljava/lang/String;
#41 = Utf8 java/lang/System
#42 = Utf8 out
#43 = Utf8 Ljava/io/PrintStream;
#44 = Utf8 java/io/PrintStream
#45 = Utf8 println
#46 = Utf8 (Ljava/lang/String;)V
{
public ExampleClass();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1
0: new #2 // class ExampleClass
3: dup
4: invokespecial #3 // Method "<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4 // Method exampleMethod:()V
12: return
public void exampleMethod();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=2, locals=4, args_size=1
0: ldc #5 // String This is an example
2: astore_1
3: iconst_0
4: istore_2
5: iload_2
6: bipush 10
8: if_icmpge 43
11: new #6 // class java/lang/StringBuilder
14: dup
15: invokespecial #7 // Method java/lang/StringBuilder."<init>":()V
18: aload_1
19: invokevirtual #8 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
22: iload_2
23: invokevirtual #9 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
26: invokevirtual #10 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
29: astore_3
30: getstatic #11 // Field java/lang/System.out:Ljava/io/PrintStream;
33: aload_3
34: invokevirtual #12 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
37: iinc 2, 1
40: goto 5
43: return
StackMapTable: number_of_entries = 2
frame_type = 253 /* append */
offset_delta = 5
locals = [ class java/lang/String, int ]
frame_type = 250 /* chop */
offset_delta = 37
}
That's quite a lot of information, but nothing beyond the actual structure of the class itself.
(You mentioned that the names of "public stuff" have to be kept. But the names of "private stuff" also have to be kept, at the very least for reflection. With methods like Class#getDeclaredFields, you can still access private fields, for example, so their names must be available somewhere.)
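As a small illustration of the reflection point, the following sketch lists and reads a private field by name. The class Account and its members secretBalance and audit are hypothetical and exist only for this example.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class PrivateNamesDemo {

    // Hypothetical class whose members are all private.
    static class Account {
        private int secretBalance = 42;

        private void audit() {
        }
    }

    /** Returns the names of all fields declared in the given class, private ones included. */
    static List<String> declaredFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (!field.isSynthetic()) { // ignore fields added by the compiler or by tools
                names.add(field.getName());
            }
        }
        return names;
    }

    /** Reads a private int field by name, which only works because the name is in the class file. */
    static int readPrivateInt(Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            return field.getInt(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declaredFieldNames(Account.class));
        System.out.println(readPrivateInt(new Account(), "secretBalance"));
    }
}
```

The lookup by the string "secretBalance" succeeds regardless of the -g setting, because field and method names are part of the class structure and not part of the debug information.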
Now, the opposite is to compile it with
javac ExampleClass.java -g
to retain all debugging information. Printing the result as described above yields
public class ExampleClass
minor version: 0
major version: 52
flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
#1 = Methodref #13.#36 // java/lang/Object."<init>":()V
#2 = Class #37 // ExampleClass
#3 = Methodref #2.#36 // ExampleClass."<init>":()V
#4 = Methodref #2.#38 // ExampleClass.exampleMethod:()V
#5 = String #39 // This is an example
#6 = Class #40 // java/lang/StringBuilder
#7 = Methodref #6.#36 // java/lang/StringBuilder."<init>":()V
#8 = Methodref #6.#41 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#9 = Methodref #6.#42 // java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
#10 = Methodref #6.#43 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#11 = Fieldref #44.#45 // java/lang/System.out:Ljava/io/PrintStream;
#12 = Methodref #46.#47 // java/io/PrintStream.println:(Ljava/lang/String;)V
#13 = Class #48 // java/lang/Object
#14 = Utf8 <init>
#15 = Utf8 ()V
#16 = Utf8 Code
#17 = Utf8 LineNumberTable
#18 = Utf8 LocalVariableTable
#19 = Utf8 this
#20 = Utf8 LExampleClass;
#21 = Utf8 main
#22 = Utf8 ([Ljava/lang/String;)V
#23 = Utf8 args
#24 = Utf8 [Ljava/lang/String;
#25 = Utf8 exampleClass
#26 = Utf8 exampleMethod
#27 = Utf8 localResult
#28 = Utf8 Ljava/lang/String;
#29 = Utf8 counter
#30 = Utf8 I
#31 = Utf8 string
#32 = Utf8 StackMapTable
#33 = Class #49 // java/lang/String
#34 = Utf8 SourceFile
#35 = Utf8 ExampleClass.java
#36 = NameAndType #14:#15 // "<init>":()V
#37 = Utf8 ExampleClass
#38 = NameAndType #26:#15 // exampleMethod:()V
#39 = Utf8 This is an example
#40 = Utf8 java/lang/StringBuilder
#41 = NameAndType #50:#51 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#42 = NameAndType #50:#52 // append:(I)Ljava/lang/StringBuilder;
#43 = NameAndType #53:#54 // toString:()Ljava/lang/String;
#44 = Class #55 // java/lang/System
#45 = NameAndType #56:#57 // out:Ljava/io/PrintStream;
#46 = Class #58 // java/io/PrintStream
#47 = NameAndType #59:#60 // println:(Ljava/lang/String;)V
#48 = Utf8 java/lang/Object
#49 = Utf8 java/lang/String
#50 = Utf8 append
#51 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#52 = Utf8 (I)Ljava/lang/StringBuilder;
#53 = Utf8 toString
#54 = Utf8 ()Ljava/lang/String;
#55 = Utf8 java/lang/System
#56 = Utf8 out
#57 = Utf8 Ljava/io/PrintStream;
#58 = Utf8 java/io/PrintStream
#59 = Utf8 println
#60 = Utf8 (Ljava/lang/String;)V
{
public ExampleClass();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
LocalVariableTable:
Start Length Slot Name Signature
0 5 0 this LExampleClass;
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1
0: new #2 // class ExampleClass
3: dup
4: invokespecial #3 // Method "<init>":()V
7: astore_1
8: aload_1
9: invokevirtual #4 // Method exampleMethod:()V
12: return
LineNumberTable:
line 4: 0
line 5: 8
line 6: 12
LocalVariableTable:
Start Length Slot Name Signature
0 13 0 args [Ljava/lang/String;
8 5 1 exampleClass LExampleClass;
public void exampleMethod();
descriptor: ()V
flags: ACC_PUBLIC
Code:
stack=2, locals=4, args_size=1
0: ldc #5 // String This is an example
2: astore_1
3: iconst_0
4: istore_2
5: iload_2
6: bipush 10
8: if_icmpge 43
11: new #6 // class java/lang/StringBuilder
14: dup
15: invokespecial #7 // Method java/lang/StringBuilder."<init>":()V
18: aload_1
19: invokevirtual #8 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
22: iload_2
23: invokevirtual #9 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
26: invokevirtual #10 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
29: astore_3
30: getstatic #11 // Field java/lang/System.out:Ljava/io/PrintStream;
33: aload_3
34: invokevirtual #12 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
37: iinc 2, 1
40: goto 5
43: return
LineNumberTable:
line 9: 0
line 10: 3
line 11: 11
line 12: 30
line 10: 37
line 14: 43
LocalVariableTable:
Start Length Slot Name Signature
30 7 3 localResult Ljava/lang/String;
5 38 2 counter I
0 44 0 this LExampleClass;
3 41 1 string Ljava/lang/String;
StackMapTable: number_of_entries = 2
frame_type = 253 /* append */
offset_delta = 5
locals = [ class java/lang/String, int ]
frame_type = 250 /* chop */
offset_delta = 37
}
SourceFile: "ExampleClass.java"
The main differences are that each method now has a LineNumberTable and a LocalVariableTable (and that the class has a SourceFile attribute). For example, consider exampleMethod():
LineNumberTable:
line 9: 0
line 10: 3
line 11: 11
line 12: 30
line 10: 37
line 14: 43
LocalVariableTable:
Start Length Slot Name Signature
30 7 3 localResult Ljava/lang/String;
5 38 2 counter I
0 44 0 this LExampleClass;
3 41 1 string Ljava/lang/String;
The details about the structure of these attributes are given in the documentation of the LineNumberTable and the LocalVariableTable.
For the LineNumberTable, it says
It may be used by debuggers to determine which part of the code array corresponds to a given line number in the original source file.
For the LocalVariableTable, it says
It may be used by debuggers to determine the value of a given local variable during the execution of a method.
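Debuggers are not the only place where this information becomes visible. The line numbers and the file name shown in a stack trace come from the same debug information, and StackTraceElement is documented to return a negative line number or a null file name when it is unavailable. The following is a minimal sketch with hypothetical names (LineNumberDemo, currentFrame, describe); what it prints depends on the -g setting the class was compiled with.

```java
class LineNumberDemo {

    /** Returns the stack frame of this method itself. */
    static StackTraceElement currentFrame() {
        // Element 0 is the frame in which the Throwable was created.
        return new Throwable().getStackTrace()[0];
    }

    /** Describes a frame, falling back to a note when debug information is missing. */
    static String describe(StackTraceElement frame) {
        String file = frame.getFileName() != null
                ? frame.getFileName()
                : "unknown file (no source file information)";
        String line = frame.getLineNumber() >= 0
                ? "line " + frame.getLineNumber()
                : "unknown line (no line number information)";
        return frame.getMethodName() + " at " + file + ", " + line;
    }

    public static void main(String[] args) {
        System.out.println(describe(currentFrame()));
    }
}
```

The method name is always available, while the file and line parts should fall back to the "unknown" texts when the class is compiled with -g:none.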
In the output of javap, the names of the local variables are already resolved. However, the table itself only stores indices into the constant pool (which is why the constant pool has more entries when debugging information is retained). For example, the entry for the localResult variable is shown as
30 7 3 localResult Ljava/lang/String;
although it actually only contains a reference to the entry
#27 = Utf8 localResult
of the constant pool.
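To see that these names really are ordinary Utf8 entries of the constant pool, the pool can be read directly from a class file. The following is a minimal sketch rather than a full class file parser: it walks the constant pool according to the tags defined in the class file format and collects only the Utf8 entries. The class names ConstantPoolStrings and Sample, as well as the variable doubled, are hypothetical.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ConstantPoolStrings {

    // Hypothetical class to inspect; "doubled" is a local variable name.
    static class Sample {
        int twice(int input) {
            int doubled = input * 2;
            return doubled;
        }
    }

    /** Returns all CONSTANT_Utf8 entries of the class file read from the stream. */
    static List<String> utf8Entries(InputStream classFile) {
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(classFile)) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort(); // constant_pool_count
            for (int index = 1; index < count; index++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: // Utf8: 2-byte length plus modified UTF-8, as readUTF expects
                        result.add(in.readUTF());
                        break;
                    case 7: case 8: case 16: case 19: case 20:
                        in.readFully(new byte[2]);
                        break;
                    case 15: // MethodHandle
                        in.readFully(new byte[3]);
                        break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.readFully(new byte[4]);
                        break;
                    case 5: case 6: // Long and Double take 8 bytes and two pool slots
                        in.readFully(new byte[8]);
                        index++;
                        break;
                    default:
                        throw new IllegalArgumentException("Unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String resource = Sample.class.getName().replace('.', '/') + ".class";
        InputStream stream = Sample.class.getClassLoader().getResourceAsStream(resource);
        if (stream == null) {
            throw new IllegalStateException("Class file not found: " + resource);
        }
        List<String> entries = utf8Entries(stream);
        System.out.println("twice: " + entries.contains("twice"));
        System.out.println("LocalVariableTable: " + entries.contains("LocalVariableTable"));
        System.out.println("doubled: " + entries.contains("doubled"));
    }
}
```

The nested class Sample is inspected instead of the outer class on purpose: the string literals used for the lookups are themselves Utf8 entries in the outer class's constant pool and would make the check always succeed there.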
So, is this information simply figured out by the decompiler through some magic, or does the compiled class retain a lot of information, and what for?
As shown above, the compiled class can retain a lot of information. After all, one of the main purposes of an IDE is to provide a nice, visual interface to a debugger. Therefore, most compilers that are triggered by an IDE in one way or another will, by default, try to retain as much debug information as possible.