While upgrading our project from Java 17 to Java 21, we noticed an increase in memory consumption. After dumping the heap and analyzing the differences, I found that there are thousands of empty strings stored in memory.
I succeeded in reproducing the issue with the following code:
import java.lang.management.ManagementFactory;
import java.text.DecimalFormat;

public class DecimalFormating {
    // Two formatters are enough to make the empty strings show up in the dump
    static DecimalFormat decimalFormat = new DecimalFormat("#.##");
    static DecimalFormat decimalFormat2 = new DecimalFormat();

    public static void main(String[] args) {
        // Dump the heap with jmap when the JVM shuts down
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                // The runtime name typically has the form "pid@hostname"
                String pid = ManagementFactory.getRuntimeMXBean().getName().split("@")[0];
                Process p = Runtime.getRuntime().exec("D:\\JAVA\\jdk-17.0.2\\bin\\jmap.exe -dump:format=b,file=heapdump_string_decimal_17.hprof " + pid);
                p.waitFor();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }));
    }
}
The code above is straightforward: it defines two instances of DecimalFormat, which in turn define multiple empty strings, as seen here and here. A shutdown hook then dumps the heap into a file.
I compiled and ran the code with both Java 17.0.2 and Java 21.0.6, and here is what the memory looks like:
Is this behavior normal? I can't find any mention of this kind of change in the release notes of Java between versions 18 and 21.
TL;DR: This will be fixed in Java 21.0.7(*), and has been fixed in Java 22.0.2 and Java 23 and later.
The initialization with an empty string that you link to is not what ends up in those fields: they are overwritten with the result of StringBuffer.toString() (Java 17) or StringBuilder.toString() (Java 21) calls in the applyPattern method, which is called from the DecimalFormat constructors. The actual problem is that the toString() method of StringBuffer/StringBuilder changed significantly in what it returns when the buffer is empty.
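One way to observe this from the outside, without a heap dump, is to compare the affixes of a DecimalFormat against the interned "" literal. This is only a sketch: the class name AffixIdentityProbe and the helper isSharedEmpty are made up for illustration, and it assumes that getPositivePrefix() and getPositiveSuffix() return the stored fields unchanged, which is my reading of the JDK source rather than something the documentation promises.

```java
import java.text.DecimalFormat;

public class AffixIdentityProbe {
    // Hypothetical helper: true only if s is the very same object as the interned "" literal.
    static boolean isSharedEmpty(String s) {
        return s == "";
    }

    public static void main(String[] args) {
        DecimalFormat format = new DecimalFormat("#.##");
        // Assumption: these getters hand back the fields that applyPattern populated.
        String prefix = format.getPositivePrefix();
        String suffix = format.getPositiveSuffix();

        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("positive prefix: empty=" + prefix.isEmpty()
                + ", shared instance=" + isSharedEmpty(prefix));
        System.out.println("positive suffix: empty=" + suffix.isEmpty()
                + ", shared instance=" + isSharedEmpty(suffix));
    }
}
```

If the explanation here holds, I would expect "shared instance" to differ between an affected and an unaffected JDK, while "empty" stays true on both. I have not verified the output on every release.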
In Java 17 (17.0.14), StringBuffer.toString() does:
@Override
@IntrinsicCandidate
public synchronized String toString() {
    if (toStringCache == null) {
        return toStringCache =
                isLatin1() ? StringLatin1.newString(value, 0, count)
                           : StringUTF16.newString(value, 0, count);
    }
    return new String(toStringCache);
}
(Though oddly enough, if you called toString() twice without modifying the buffer in between, the second call would return a new instance.)
For an empty buffer, this calls StringLatin1.newString, which returns the same empty string each time:
public static String newString(byte[] val, int index, int len) {
    if (len == 0) {
        return "";
    }
    return new String(Arrays.copyOfRange(val, index, index + len),
                      LATIN1);
}
In Java 21 (or at least, somewhere after Java 17), the DecimalFormat implementation switched to StringBuilder, and in Java 21 (21.0.6) the toString() of StringBuilder does:
@Override
@IntrinsicCandidate
public String toString() {
    // Create a copy, don't share the array
    return new String(this);
}
This returns a new instance each and every time (though I didn't check whether there is an intrinsic and, if so, whether it does something else).
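Since I'm reasoning from source rather than from a running JVM, a small probe can settle what a given runtime actually does, intrinsic or not. The sketch below compares the identity of two empty toString() results; the class and method names are made up for illustration.

```java
public class EmptyToStringProbe {
    // True if two independent, empty StringBuilders hand out the same String object.
    static boolean sharesEmptyInstance() {
        String first = new StringBuilder().toString();
        String second = new StringBuilder().toString();
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("empty toString() results share one instance: "
                + sharesEmptyInstance());
    }
}
```

Going by the source quoted here, I would expect false on 21.0.6. I have not run it across versions, so treat the result of your own run as authoritative.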
The allocation of a new empty string was addressed in later Java versions. In Java 24 (24.0.0), StringBuilder.toString() does:
@Override
@IntrinsicCandidate
public String toString() {
    if (length() == 0) {
        return "";
    }
    // Create a copy, don't share the array
    return new String(this, null);
}
This is bug fix JDK-8325730, fixed in Java 23 and backported to Java 22.0.2 and Java 21.0.7(*) (which hasn't been released yet). This issue has triggered additional discussion, see JDK-8332282 and JDK-8138614, as the documentation of StringBuilder.toString() explicitly says:
A new String object is allocated and initialized to contain the character sequence currently represented by this object.
So the fix doesn't actually conform to that documentation, as it doesn't return a new instance for an empty buffer. The documentation will change in Java 25 to no longer require a new instance.
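For your own code that builds many possibly-empty strings on a JDK without the fix, the same length check can be applied at the call site. This is a minimal sketch with a hypothetical helper name (toStringSharingEmpty); it does nothing for DecimalFormat itself, whose fields are assigned inside the JDK.

```java
public class EmptyAwareToString {
    // Hypothetical helper: applies the same length check as the JDK fix, but in application code.
    static String toStringSharingEmpty(CharSequence buffer) {
        // The "" literal is interned, so every empty result is the same object.
        return buffer.length() == 0 ? "" : buffer.toString();
    }

    public static void main(String[] args) {
        String a = toStringSharingEmpty(new StringBuilder());
        String b = toStringSharingEmpty(new StringBuffer());
        System.out.println(a == b); // prints "true": both are the interned literal

        String c = toStringSharingEmpty(new StringBuilder("1.5"));
        System.out.println(c); // prints "1.5"
    }
}
```

Non-empty buffers still go through the regular toString(), so only the empty case is affected.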
As far as I can tell from a quick look, before Java 15, the behaviour was similar to the Java 21 behaviour (returning new empty instances).
For example, Java 8 (8.0.442) does this in StringBuffer.toString():
@Override
public synchronized String toString() {
    if (toStringCache == null) {
        toStringCache = Arrays.copyOfRange(value, 0, count);
    }
    return new String(toStringCache, true);
}
And while the Java 11 implementation of StringBuffer.toString() was the same as Java 17's, in Java 11 (11.0.26) StringLatin1.newString always returned a new copy:
public static String newString(byte[] val, int index, int len) {
    return new String(Arrays.copyOfRange(val, index, index + len),
                      LATIN1);
}
*: The backport issue JDK-8331299 lists the fix version as 21.0.7-oracle, so I'm not sure if this fix will also land in OpenJDK, or only in the Oracle builds.