I'm new to Java and programming in general. I'm currently studying how Java handles the storage and scope of variables. What I've understood is that:
local variables (i.e. variables declared inside methods) are stored on the stack of the respective thread, inside the stack frame of the method where they are declared. Once the method ends its execution, the respective stack frame is removed from the stack and consequently the local variables cease to exist
the scope of the local variables goes from the point where they are declared to the end of the block in which they are contained, including any nested block. Therefore it is possible to declare two homonymous variables inside the same method if the scope of the two doesn't overlap. So it is legal to write something like this:
public void myMethod() {
    String[] myArray = new String[2];
    for (int i = 0; i < myArray.length; i++) {
        String message = "hello";
        System.out.println(message);
    }
    if (true) {
        int number = 3;
        System.out.println(number);
    }
    String message = "Ciao";
    int number = -50;
    System.out.println(message);
    System.out.println(number);
}
What I would like to understand is how Java handles memory with regard to homonymous variables that exist inside the same method but whose scopes don't overlap. As I've said earlier, I've learnt that local variables are stored inside the method's stack frame until the method ends, but if that is really so, how can two variables with the same name be stored inside the same stack frame? Do they really both live in the method's stack frame until the method ends, or is a local variable removed from the stack frame once its scope has ended for good, even though the method hasn't finished yet, thereby allowing the creation of another local variable with the same name?
Local variables don't exist. At all.
Java is a multi-step process. First, a compiler (javac.exe) turns .java files into .class files. Then a runtime (java.exe) runs the class file.
During that first step (compilation), local variables are lost. Compilation is very formulaic: a spec spells out precisely how javac works. This is in contrast to C compilers, which are allowed to, for example, do a deep analysis of your code, determine that a loop has no side effects whatsoever, and completely eliminate it from the executable that gcc produces.
Specifically, at the class file level, the system uses something very different: the stack, and slots.
Any method declares how many 'slots' it needs. The way java's specification works is: The spec decrees what should happen and which guarantees must be provided. It never spells out how things are done. But, sometimes explaining one particular 'how' is simpler than trying to delve into the spec. Know that this describes how most JVMs do it - but a JVM implementation doesn't have to:
Bytecode does not refer to 'local variables' because these do not exist in bytecode. Instead, bytecode can:
Operate on the stack, for example with POP, which removes and discards the top of the stack, or FADD, which pops 2 values off of the stack (the JVM explodes if they aren't floats, so we can assume they are), adds the two floats, and pushes the result back onto the stack. (NB: The verifier checks that the 'explodes' situation cannot occur. It sounds more dramatic than it is.)
Operate on slots, for example with ALOAD_1, a simple bytecode instruction that fetches an object reference from 'slot #1' (that'd be the second slot; the first slot is slot #0) and pushes it onto the stack.
Thus, this Java code:
int a = readInt();
int b = readInt();
println(a + b);
might be compiled as:
[START METHOD]
[META: SLOT SIZE: 2]
// int a = readInt(); - 'a' translates to slot 0.
INVOKESTATIC com.foo.KeyboardInput readInt()I
ISTORE_0
// int b = readInt(); - 'b' translates to slot 1.
INVOKESTATIC com.foo.KeyboardInput readInt()I
ISTORE_1
ILOAD_0 // load int value in slot 0 and push onto stack.
ILOAD_1 // load int value in slot 1 and push onto stack.
IADD // pop 2 int values, add them, push it back
INVOKESTATIC com.foo.BasicOutput println(I)V
Where println consumes 1 int off the stack.
Let's now 'fancy up' our method and introduce another local var:
int a = readInt();
int b = readInt();
int c = a + b;
println(c);
This code is easily identifiable as entirely identical in operation to the first, and a compiler is free to produce the exact same bytecode for it, with no 'third slot' appearing. There is no requirement to translate a local variable 'one-to-one' to a slot: as long as you don't use c anywhere else, there is no need for it to exist as a slot. (javac itself tends to translate quite literally and, as far as I know, does give c its own slot, but that is a choice it makes, not something the class file format demands.)
In fact, a compiler could go further and end up with no slots whatsoever! After all, if that is the entire method, and a/b/c aren't used anywhere else, it could produce:
[START METHOD]
[META: SLOT SIZE: 0]
// readInt(); - just leave the read value on the stack.
INVOKESTATIC com.foo.KeyboardInput readInt()I
// readInt(); again - the second value now sits on top of the first on the stack.
INVOKESTATIC com.foo.KeyboardInput readInt()I
IADD // pop 2 int values, add them, push it back
INVOKESTATIC com.foo.BasicOutput println(I)V
No need for even a single slot.
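To make the two listings above concrete, here is a minimal sketch of a toy frame with an operand stack and numbered slots. It is not how a real JVM is implemented; the class name ToyFrame and the instruction spellings READINT and PRINTLN are made up for this illustration (they stand in for the INVOKESTATIC calls), and readInt() is replaced by a fixed list of inputs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy model of a frame: an operand stack plus numbered slots. Hypothetical, not the real JVM. */
final class ToyFrame {
    private final Deque<Integer> stack = new ArrayDeque<>(); // operand stack
    private final int[] slots;                               // the method's declared slots
    private final Iterator<Integer> input;                   // stands in for readInt()
    private Integer printed;                                 // last value consumed by PRINTLN

    ToyFrame(int slotCount, List<Integer> inputs) {
        this.slots = new int[slotCount];
        this.input = inputs.iterator();
    }

    /** Runs made-up instructions such as "ISTORE 0", "ILOAD 1", "IADD"; returns the last printed value. */
    Integer run(String... program) {
        for (String instruction : program) {
            String[] parts = instruction.split(" ");
            switch (parts[0]) {
                case "READINT": stack.push(input.next()); break;                        // push one input
                case "ISTORE": slots[Integer.parseInt(parts[1])] = stack.pop(); break;  // stack -> slot
                case "ILOAD": stack.push(slots[Integer.parseInt(parts[1])]); break;     // slot -> stack
                case "IADD": stack.push(stack.pop() + stack.pop()); break;              // pop 2, push sum
                case "PRINTLN": printed = stack.pop(); System.out.println(printed); break;
                default: throw new IllegalArgumentException("unknown instruction: " + instruction);
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        // First listing: the two values travel through slots 0 and 1.
        new ToyFrame(2, List.of(3, 4)).run(
                "READINT", "ISTORE 0", "READINT", "ISTORE 1",
                "ILOAD 0", "ILOAD 1", "IADD", "PRINTLN");
        // Second listing: zero slots, everything stays on the operand stack.
        new ToyFrame(0, List.of(3, 4)).run("READINT", "READINT", "IADD", "PRINTLN");
    }
}
```

Both programs print 7 for the inputs 3 and 4, and neither mentions a variable name anywhere: the names a, b and c only ever existed in the source.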
Similarly then, imagine this method:
void example() {
    int a = ....;
    int b = ....;
    // Tons and tons of math with a and b.
    println(a + b);
    // from here on out, a and b are never used again.
    int c = ....;
    int d = ....;
    // Tons and tons of math with c and d.
    // note that a and b are not used in this code
}
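A compiler needs at most 2 slots for this method, because 'c' and 'd' can just go where 'a' and 'b' used to be: as 'a' and 'b' aren't being used any more, their slots can be 'overwritten'. (javac itself is less aggressive and, as far as I know, only reuses a slot once the block that declared its variable has ended, which is exactly the situation in your method.)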
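Slot reuse is easiest to see on the question's own method, where the scopes of the two message variables and the two number variables do not overlap. The following is a sketch of one possible allocation scheme, in which slots are handed out per declaration and released when the enclosing block ends. This is more conservative than the "never used again" reuse described above, and the class ToySlotAllocator and its methods are hypothetical names for this illustration, not part of any compiler. To see what your compiler really did, inspect the class file with javap -v.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy slot allocator (hypothetical): releases a block's slots when the block ends. */
final class ToySlotAllocator {
    private final Deque<Integer> blockStarts = new ArrayDeque<>(); // next free slot at each block entry
    private int nextFree; // next unused slot number
    private int maxSlots; // most slots ever in use at the same time

    /** Assigns a slot to a declaration; the name is only printed, never stored. */
    int declare(String name) {
        int slot = nextFree++;
        maxSlots = Math.max(maxSlots, nextFree);
        System.out.println(name + " -> slot " + slot);
        return slot;
    }

    void enterBlock() {
        blockStarts.push(nextFree);
    }

    /** Leaving a block frees every slot handed out inside it. */
    void exitBlock() {
        nextFree = blockStarts.pop();
    }

    int maxSlots() {
        return maxSlots;
    }

    public static void main(String[] args) {
        ToySlotAllocator slots = new ToySlotAllocator();
        slots.declare("this");     // instance method: slot 0 holds the this reference
        slots.declare("myArray");
        slots.enterBlock();        // for (...) header
        slots.declare("i");
        slots.enterBlock();        // loop body
        slots.declare("message");  // first message
        slots.exitBlock();
        slots.exitBlock();
        slots.enterBlock();        // if (true) { ... }
        slots.declare("number");   // first number, lands where i used to be
        slots.exitBlock();
        slots.declare("message");  // second message reuses a freed slot
        slots.declare("number");   // second number reuses a freed slot
        System.out.println("slots needed: " + slots.maxSlots());
    }
}
```

Under this scheme the method needs 4 slots for 7 names (counting this), and the two message variables never compete for storage, because by the time the second one is declared the first one's slot has already been given back.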
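Hence, there is simply no one-to-one mapping of local variables to slots: a local variable may never get a slot of its own, one slot may serve several local variables in turn, and some slots hold things you never declared, such as the this reference in instance methods.
With all that context, your question has become meaningless: Local variables don't exist in class files, therefore 'how does the JVM deal with 2 locals with the same name' is trivial. The names are resolved by the compiler; the bytecode only ever refers to slot numbers.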