I'm confused about Java's String.intern() method documentation. The official documentation states:
Returns a canonical representation for the string object. A pool of strings, initially empty, is maintained privately by the class String. When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
But when we create a String using new, isn't the string literal already added to the String Pool? For example:
String s1 = new String("Java"); // Creates 2 objects
So when exactly does intern() actually add a new string to the pool? The documentation seems misleading, or am I missing something?
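To make the premise concrete, here is a minimal sketch of what happens with new String("Java"). The class and method names (LiteralVsNew, compare) are made up for illustration. It relies only on two guarantees: new always creates a fresh object, and string literals refer to the pooled instance.

```java
class LiteralVsNew {
    // Returns { s1 == s2, s1.intern() == s2, s1.intern() == s1 }.
    static boolean[] compare() {
        String s1 = new String("Java"); // fresh heap object, distinct from the literal
        String s2 = "Java";             // the pooled instance for the literal
        return new boolean[] { s1 == s2, s1.intern() == s2, s1.intern() == s1 };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println("new == literal:          " + r[0]); // false
        System.out.println("new.intern() == literal: " + r[1]); // true
        System.out.println("new.intern() == new:     " + r[2]); // false
    }
}
```

In this case intern() adds nothing: the literal is already pooled, so the call just returns the existing pooled object.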
After carefully reading the answers and comments, I think I now understand exactly how String.intern() works. Let me explain with this code example:
public class InternDemo {
    public static void main(String[] args) {
        // Creating strings from bytes helps demonstrate that these strings
        // are definitely NOT in the string pool initially
        String f1 = new String(new byte[] {0x68, 0x69}); // creates "hi" in heap only
        String f2 = new String(new byte[] {0x68, 0x69}); // creates another "hi" in heap only
        System.out.println("Equals: " + f1.equals(f2)); // true - same content

        String i1 = f1.intern();
        System.out.println("ref check for 1: " + (i1 == f1)); // true
        String i2 = f2.intern();
        System.out.println("ref check for 2: " + (i2 == f2)); // false
        System.out.println("ref check both: " + (i1 == i2)); // true

        // The crucial test:
        String str = "hi"; // String literals are always interned
        // true on HotSpot, where the literal is resolved lazily, i.e. after f1 was interned
        System.out.println("check f1 == hi: " + (f1 == str));
    }
}
What I learned:
When f1.intern() is called and "hi" isn't in the pool yet, what happens is: the actual String object referenced by f1 becomes the pooled instance. This is proven by the fact that i1 == f1 and "hi" == f1 return true - they're the exact same object.
When f2.intern() is called, since "hi" is already in the pool (it's f1's object), it returns that pooled instance. That's why i2 == f2 is false (different objects) and i1 == i2 is true (both point to the pooled instance).
The final test with String str = "hi" confirms this understanding. Since string literals are always interned, str must reference the pooled instance of "hi". The fact that f1 == str returns true proves that f1's object really did become the pooled instance.
This helped me understand what the documentation means by "this String object is added to the pool and a reference to this String object is returned" - it's literally using the original string object as the pooled instance rather than creating a copy.
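One way to picture the documented behavior is a toy pool built on a map. This is only a model of the semantics described in the Javadoc, not how the JVM actually implements its string table. ToyPool and its intern method are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class ToyPool {
    // Maps each distinct string content to its one canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    // If an equal string is already pooled, return it; otherwise pool s itself (no copy).
    String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool();
        String a = new String("hi");
        String b = new String("hi");
        System.out.println(pool.intern(a) == a); // true: a itself became the pooled instance
        System.out.println(pool.intern(b) == b); // false: an equal string was already pooled
        System.out.println(pool.intern(b) == a); // true: the pooled instance is a
    }
}
```

The key detail is that the map stores s itself as the value, which mirrors "this String object is added to the pool and a reference to this String object is returned".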
Using the byte array constructor (as shown in Progman's answer) made this especially clear because it guaranteed that the strings were initially only in the heap, not the pool, allowing us to see exactly what intern() does with the first and subsequent calls.
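The same experiment can be run without any matching literal in the class at all, by building the string at run time. Below is a minimal sketch, assuming a modern JVM (JDK 7 or later, where the pool lives on the heap). RuntimeOnlyString and fresh are hypothetical names, and the nanoTime-based suffix is only there to make it very unlikely that an equal string is already pooled.

```java
class RuntimeOnlyString {
    // Builds a new String object at run time; no equal literal exists in this class.
    static String fresh(long seed) {
        return new StringBuilder("intern-demo-").append(seed).toString();
    }

    public static void main(String[] args) {
        long seed = System.nanoTime();
        String a = fresh(seed);
        String b = fresh(seed); // equal content, different object

        System.out.println("first call:  " + (a.intern() == a)); // expected true: a is added to the pool
        System.out.println("second call: " + (b.intern() == b)); // expected false: a is returned instead
        System.out.println("same pooled: " + (b.intern() == a)); // expected true
    }
}
```

Because no literal is involved, the result here does not depend on when the JVM resolves string constants.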