Tags: java, time, nanotime, current-time

Why does System.nanoTime() accumulate error over the day?


I'm trying to measure the latency between my Java publisher and an (industry-standard) C++ message broker.

The broker records the time it receives each message, and for the Java publisher I'm using the following code to get microsecond-accurate timestamps (which are stamped onto the outgoing messages):

    private static final long NANOTIME_OFFSET;

    static {
        Instant instant = Instant.now();      // Get absolute time
        long nanoTime = System.nanoTime();    // Get relative time

        // Offset to convert from relative to absolute time
        NANOTIME_OFFSET = TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano() - nanoTime;
    }

    public static long currentTimeNanos() {
        return System.nanoTime() + NANOTIME_OFFSET;
    }

However, I'm noticing that the measured latency between publisher and broker creeps up over the course of the day and doesn't come back down until I restart my Java publisher.

The latency starts off at well under 1 ms, then suddenly jumps to 500 ms. Further jumps take it closer to the 1-second mark, which is hard to believe since both processes reside on the same machine.


Profiling with VisualVM indicates there's no resource issue in my Java process, and packet capture with Wireshark confirms the issue is with the timestamps produced by currentTimeNanos().

So why does System.nanoTime() lose accuracy over the day?

I guess the correct thing to do is to always use Instant.now(), but it's a heavier call than System.nanoTime() (it generates a new Instant object every time), so it would add more overhead, especially with a high volume of messages.

Edit: I'd also add that I went with a nanoTime()-based clock because Instant.now() only yielded millisecond precision on my machine.


Solution

  • Never mix System.nanoTime with Instant.now

    The OpenJDK source code shows the System.nanoTime implementation as:

        @IntrinsicCandidate
        public static native long nanoTime();
    

    That means the method is implemented as native code.

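    Even without digging further, we know that nanoTime does not use the same source as Instant.now: the Javadoc for System.nanoTime states that the method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time. The Instant class, by contrast, uses the date-time clock of the host OS. On conventional computer hardware† (laptops, desktops, common servers), the date-time clock of the host OS typically resolves to microseconds at best, not nanoseconds. So the nanoTime feature must use another source for tracking elapsed nanos.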

    The CPU is the likely source. Modern CPUs have a regular “heartbeat” (the clock signal) that keeps them running and paces their calculations. For example, a 3 GHz CPU has three billion heartbeats per second. I imagine the nanoTime native code taps into a counter driven by this heartbeat for its count of elapsed nanoseconds.

    The point here is that these two sources of time are completely separate and distinct.

    You have mixed these two separate time sources with your code:

    NANOTIME_OFFSET = TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano() - nanoTime;
    

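    The wall clock behind Instant.now is a moving target: the host OS can adjust it at any moment (for example, when synchronizing with a time server), making it jump ahead or backward. System.nanoTime does not follow those jumps, so an offset captured once at class-load time stops matching the wall clock as soon as such an adjustment happens. Your NANOTIME_OFFSET is therefore naïve and invalid.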
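
One way to see the problem for yourself is to recompute the offset repeatedly and watch how far it moves from its initial value. The following is only a minimal sketch: the class name ClockOffsetProbe and its methods are hypothetical, and the one-second sampling interval is an arbitrary choice. If the two clocks really shared a source, the printed drift would stay near zero.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

/** Hypothetical probe (not from the original post): shows how the wall-clock/nanoTime offset moves. */
class ClockOffsetProbe {

    /** Wall-clock time as nanoseconds since the epoch, using the same arithmetic as the question. */
    static long wallClockNanos() {
        Instant now = Instant.now();
        return TimeUnit.SECONDS.toNanos(now.getEpochSecond()) + now.getNano();
    }

    /** Offset between the two clocks right now; the question captures this only once, at class load. */
    static long currentOffsetNanos() {
        return wallClockNanos() - System.nanoTime();
    }

    public static void main(String[] args) throws InterruptedException {
        long initialOffset = currentOffsetNanos();
        for (int i = 0; i < 5; i++) {
            Thread.sleep(1_000); // arbitrary sampling interval
            long driftNanos = currentOffsetNanos() - initialOffset;
            System.out.printf("offset moved by %d us since start%n",
                    TimeUnit.NANOSECONDS.toMicros(driftNanos));
        }
    }
}
```

Over a few seconds the drift will usually be tiny; the point of the probe is to leave it running (with a longer interval) across a clock adjustment and see the offset shift.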


    For the sake of completeness, I’ll mention that System.currentTimeMillis (now legacy) is supplanted by Instant.now. Both use the date-time clock of the host OS. Neither should be mixed with System.nanoTime.
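
In practice, keeping the two clocks apart might look like the sketch below: message timestamps come only from Instant.now, and System.nanoTime is used only to compare two of its own readings. The class and method names (SeparateClocks, epochNanosNow, timeIt) are hypothetical, not part of any library. As far as I know, Instant.now captures finer than millisecond resolution on Java 9 and later, though the actual granularity depends on the JVM and the host OS.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper (names are illustrative): each clock is used only for its own purpose. */
class SeparateClocks {

    /** Timestamp for an outgoing message: wall clock only, as nanoseconds since the epoch. */
    static long epochNanosNow() {
        return ChronoUnit.NANOS.between(Instant.EPOCH, Instant.now());
    }

    /** Elapsed time of a task: two nanoTime readings compared only with each other. */
    static Duration timeIt(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return Duration.ofNanos(System.nanoTime() - start);
    }
}
```

A publisher would stamp each message with epochNanosNow() and reserve timeIt (or raw nanoTime differences) for measuring durations inside the same JVM.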


    † Nowadays, you can buy an atomic clock, at considerable cost (though perhaps less than you expect). A recent topic on YouTube is roll-your-own, super-accurate, very cheap time servers that capture the current moment as broadcast by satellite navigation systems (GPS, etc.) and serve it from inexpensive computers such as the Raspberry Pi. To go down this time-geek rabbit hole, you might start with the Jeff Geerling channels.