javazoneid

Many instances of java.time.ZoneRegion in the Java heap. Aren't ZoneId instances supposed to be cached?


I have a lot of objects that have a ZoneId tz field. Every tz value is created via the static ZoneId.of method. I chose ZoneId over a plain String because I expected ZoneId instances to be cached (there is only a limited set of time zones). However, after analyzing the heap I found that this is not the case.


Every call to ZoneId.of with the same parameter creates a new instance of ZoneRegion:

ZoneId zoneId1 = ZoneId.of("Europe/Kiev"); //allocates new ZoneRegion
ZoneId zoneId2 = ZoneId.of("Europe/Kiev"); //allocates new ZoneRegion

My question is - is that expected or this is some kind of JVM bug?

openjdk version "21.0.1" 2023-10-17
OpenJDK Runtime Environment (build 21.0.1+12-Ubuntu-222.04)
OpenJDK 64-Bit Server VM (build 21.0.1+12-Ubuntu-222.04, mixed mode, sharing)

Solution

  • My question is - is that expected or this is some kind of JVM bug?

    Checking the most recent javadoc (JDK22u), 'these values are singletons' is mentioned nowhere. We can also state your case a lot more simply, without bringing heap analysis into it:

    // bug claim:
    ZoneId a = ZoneId.of("Europe/Kiev");
    ZoneId b = ZoneId.of("Europe/Kiev");
    System.out.println(a == b); // should print true!
    

    ... but this prints false, which implies that the mechanisms powering ZoneId.of do not produce singleton ZoneId objects: the JDK does not ship with every possible ZoneId object pre-cached, and ZoneId.of itself does not build up a cache of instances either.
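    To make the distinction concrete, here is a minimal sketch of comparing the same two zones by value instead of by reference. The class name ZoneIdEquality and the helper sameZone are made up for illustration; equals and hashCode are the documented ZoneId methods.

```java
import java.time.ZoneId;

public class ZoneIdEquality {

    // Hypothetical helper: compares two zone IDs by value, never by reference.
    static boolean sameZone(String first, String second) {
        return ZoneId.of(first).equals(ZoneId.of(second));
    }

    public static void main(String[] args) {
        ZoneId a = ZoneId.of("Europe/Kiev");
        ZoneId b = ZoneId.of("Europe/Kiev");

        // Value comparison is what the API specifies.
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true

        // Reference comparison is unspecified: do not build logic on its result.
        System.out.println(a == b);
    }
}
```

    Whatever the last line prints on a given JDK, code that sticks to equals keeps working if the caching behaviour ever changes.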

    or this is some kind of JVM bug?

    Well, the Java Language Specification and the Java Virtual Machine Specification aren't going to (and shouldn't) mention anything about this: ZoneId is not in the java.lang package or one of its subpackages, which is where the types live that are so crucial to the language itself that the JLS/JVMS need to mention them.

    Thus, the javadoc is the canonical source of the specification.

    Neither the javadoc of ZoneId nor the javadoc of ZoneId.of says anywhere that instances are singletons.

    For reference, current JDK22u sources of ZoneId.

    There is the line:

     * The ID is unique within the system.
    

    That line is on ZoneId itself, and one might read "ID" there as short for "an instance of ZoneId", but that is far from a slam-dunk interpretation. What it is actually saying is that e.g. Europe/Kiev, as a concept, cannot refer to two different notions of a timezone at the same time. This part of the docs is not about the implementation-level notion of '... and the system shall cache this so that no two instances need to exist'.

    The javadoc does contain a rider (at the end of the main class description) that instances are to be treated as value-based: you should not make an assumption either way about what e.g. == would do. In other words, the authors explicitly reserve the right to introduce some sort of caching mechanism, or to turn these into straight-up Valhalla-style value classes that have no identity in the first place. That simply means you shouldn't rely on either outcome: you cannot rely on repeated calls to ZoneId.of("Europe/Kiev") returning different objects, nor can you rely on such calls returning the same object. Code-wise, you shouldn't be thinking of ZoneId instances as objects with an identity at all.
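    One practical consequence of the value-based contract, sketched under my own naming (ZoneIdAsKey and countByZone are hypothetical): separately created ZoneId instances for the same ID behave as a single key in a hash-based collection, so identity never needs to enter the picture.

```java
import java.time.ZoneId;
import java.util.HashMap;
import java.util.Map;

public class ZoneIdAsKey {

    // Hypothetical helper: counts how often each zone occurs.
    // Every name gets its own ZoneId.of call, so the keys are distinct
    // calls that must still collapse into one entry per zone.
    static Map<ZoneId, Integer> countByZone(String... zoneNames) {
        Map<ZoneId, Integer> counts = new HashMap<>();
        for (String name : zoneNames) {
            counts.merge(ZoneId.of(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<ZoneId, Integer> counts = countByZone("Europe/Kiev", "Europe/Kiev", "UTC");
        System.out.println(counts.size());                         // 2
        System.out.println(counts.get(ZoneId.of("Europe/Kiev")));  // 2
    }
}
```

    For the same reason, a ZoneId should not be used as a lock in a synchronized block: that is identity-sensitive behaviour, which the value-based note warns against.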

    A solution

    If your profiler report strongly suggests that reducing the number of ZoneId instances is going to help, then the API docs leave you free to cache them yourself:

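    // A static cache can be reached from several threads, so use
    // java.util.concurrent.ConcurrentHashMap; a plain HashMap is not thread-safe.
    private static final Map<String, ZoneId> CACHED_ZONE_IDS = new ConcurrentHashMap<>();
    
    // Returns the cached ZoneId for this ID, creating it on first use.
    public static ZoneId cachedOf(String id) {
      return CACHED_ZONE_IDS.computeIfAbsent(id, ZoneId::of);
    }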
    

    If you're worried about needing to evict ZoneIds that are no longer in active use from your cached map, you could look into Guava's CacheBuilder, but that seems like overkill here; I'd just leave it as the few short lines of code above.
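    For completeness, a self-contained sketch of such a cache that can be compiled and run as is. The class name ZoneIds is hypothetical, and it uses a ConcurrentHashMap on the assumption that the static cache may be called from more than one thread.

```java
import java.time.ZoneId;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class ZoneIds {

    // One shared ZoneId per ID string; entries are never evicted.
    private static final Map<String, ZoneId> CACHE = new ConcurrentHashMap<>();

    private ZoneIds() {
        // utility class, not meant to be instantiated
    }

    // Returns the cached ZoneId for the given ID, creating it on first use.
    // An invalid ID makes ZoneId.of throw, and nothing is stored in that case.
    public static ZoneId cachedOf(String id) {
        return CACHE.computeIfAbsent(id, ZoneId::of);
    }

    public static void main(String[] args) {
        ZoneId first = ZoneIds.cachedOf("Europe/Kiev");
        ZoneId second = ZoneIds.cachedOf("Europe/Kiev");

        // Both lookups return the single instance held by the cache.
        System.out.println(first == second);                          // true
        System.out.println(first.equals(ZoneId.of("Europe/Kiev")));  // true
    }
}
```

    Callers should still compare the results with equals; the cache only reduces the number of live instances, it does not change the API contract.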