I don't understand Java String.intern() Documentation?


I'm confused about Java's String.intern() method documentation. The official documentation states:

Returns a canonical representation for the string object. A pool of strings, initially empty, is maintained privately by the class String. When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

But when we create a String using new, isn't the string literal already added to the String Pool? For example:

String s1 = new String("Java");  // Up to 2 objects: the pooled literal "Java" plus a new, separate copy

So when exactly does intern() actually add a new string to the pool? The documentation seems misleading, or am I missing something?


Solution

  • After carefully reading the answers and comments, I think I now understand exactly how String.intern() works. Let me explain with this code example:

    public class InternDemo {
        public static void main(String[] args) {
            // Building the strings from bytes (0x68, 0x69 is "hi" in ASCII) helps
            // demonstrate that they are definitely NOT in the string pool initially.
            String f1 = new String(new byte[] {0x68, 0x69});  // creates "hi", not pooled
            String f2 = new String(new byte[] {0x68, 0x69});  // creates another "hi", not pooled

            System.out.println("Equals: " + f1.equals(f2));  // true - same content

            String i1 = f1.intern();
            System.out.println("ref check for 1: " + (i1 == f1)); // true
            String i2 = f2.intern();
            System.out.println("ref check for 2: " + (i2 == f2)); // false
            System.out.println("ref check both: " + (i1 == i2)); // true

            // The crucial test:
            String str = "hi";  // String literals are always interned
            System.out.println("check f1 == hi: " + (f1 == str)); // true
        }
    }
    

    What I learned:

    1. When f1.intern() is called and "hi" isn't in the pool yet, the String object referenced by f1 itself becomes the pooled instance. This is proven by the fact that both i1 == f1 and "hi" == f1 are true: they are all the exact same object.

    2. When f2.intern() is called, "hi" is already in the pool (it is f1's object), so the call returns that pooled instance. That's why:

      • i2 == f2 is false (different objects)
      • i1 == i2 is true (both point to the pooled instance)

    3. The final test with String str = "hi" confirms this understanding. Since string literals are always interned, str must reference the pooled instance of "hi". The fact that f1 == str is true proves that f1's object really did become the pooled instance.

    This helped me understand what the documentation means by "this String object is added to the pool and a reference to this String object is returned" - it's literally using the original string object as the pooled instance rather than creating a copy.
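To tie this back to the question, here is a minimal sketch of both branches of the documentation side by side. The class and method names are made up for illustration, and the "added to the pool" result assumes a reasonably modern JVM (as far as I know, HotSpot before JDK 7 copied the string into a separate area instead, so the reference comparison could differ there). A random UUID is used only to get a string that is very unlikely to be in the pool already.

```java
import java.util.UUID;

public class InternBranches {
    // Case from the question: the literal "Java" is already in the pool,
    // so intern() returns the pooled literal, not the object created by new.
    static boolean internReturnsExistingLiteral() {
        String s1 = new String("Java");   // separate object with the same content
        String pooled = s1.intern();      // finds the literal's object in the pool
        return pooled != s1 && pooled == "Java";
    }

    // "Otherwise" branch: no equal string is pooled yet, so this very
    // object is added to the pool and returned (assumption: JDK 7 or later).
    static boolean internAddsThisObject() {
        String fresh = UUID.randomUUID().toString();  // built at runtime, not pooled
        return fresh.intern() == fresh;
    }

    public static void main(String[] args) {
        System.out.println("returns existing literal: " + internReturnsExistingLiteral());
        System.out.println("adds this object: " + internAddsThisObject());
    }
}
```

So with new String("Java") the pool is not modified by intern() at all; it only grows when the string was built at runtime and no equal string has been pooled before.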

    Using the byte array constructor (as shown in Progman's answer) made this especially clear: it guarantees that the strings start out as ordinary objects outside the pool, so we can see exactly what intern() does on the first call and on subsequent calls.
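The byte array constructor is not the only way to get a string that starts outside the pool. The sketch below (hypothetical class and method names, not taken from any of the answers) compares a few ways of producing "hi" against the pooled literal. It passes an explicit charset because the one-argument byte constructor decodes with the platform default charset.

```java
import java.nio.charset.StandardCharsets;

public class PooledOrNot {
    // Decoded from bytes at runtime: a new object, not the pooled one.
    static boolean fromBytesIsPooled() {
        String s = new String(new byte[] {0x68, 0x69}, StandardCharsets.US_ASCII);
        return s == "hi";
    }

    // Built from a char array at runtime: also a new object.
    static boolean fromCharsIsPooled() {
        String s = new String(new char[] {'h', 'i'});
        return s == "hi";
    }

    // Compile-time constant expression: folded to the literal "hi" by the compiler.
    static boolean constantConcatIsPooled() {
        String s = "h" + "i";
        return s == "hi";
    }

    // Concatenation involving a non-final variable happens at runtime.
    static boolean runtimeConcatIsPooled() {
        String h = "h";
        String s = h + "i";
        return s == "hi";
    }

    // Interning the runtime result yields the pooled instance.
    static boolean internedConcatIsPooled() {
        String h = "h";
        return (h + "i").intern() == "hi";
    }

    public static void main(String[] args) {
        System.out.println("from bytes:      " + fromBytesIsPooled());
        System.out.println("from chars:      " + fromCharsIsPooled());
        System.out.println("constant concat: " + constantConcatIsPooled());
        System.out.println("runtime concat:  " + runtimeConcatIsPooled());
        System.out.println("interned concat: " + internedConcatIsPooled());
    }
}
```

Only the constant expression and the explicitly interned string should compare identical to the literal; the other three are equal in content but are separate objects.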