I'm trying to measure the latency between my Java publisher and an (industry-standard) C++ message broker.
The broker records the time it receives each message, and for the Java publisher I'm using the following code to get microsecond-accurate timestamps (which are stamped onto each outgoing message):
private static final long NANOTIME_OFFSET;

static {
    Instant instant = Instant.now();    // Get absolute time
    long nanoTime = System.nanoTime();  // Get relative time
    // Offset to convert from relative to absolute time
    NANOTIME_OFFSET = TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano() - nanoTime;
}

public static long currentTimeNanos() {
    return System.nanoTime() + NANOTIME_OFFSET;
}
However, I'm noticing the measured latency between publisher and broker creeps up over the course of the day and doesn't come back down until I restart my Java publisher.
The latency starts off well under 1 ms, then suddenly jumps to around 500 ms. Further jumps take it closer to the 1-second mark, which is hard to believe since both processes reside on the same machine.
Profiling with VisualVM indicates there's no resource issue in my Java process, and a packet capture with Wireshark confirms the issue is with the timestamps produced by currentTimeNanos().
So why does System.nanoTime() lose accuracy over the course of the day?
I guess the correct thing to do is to always use Instant.now(), but it's a heavier call than System.nanoTime() (it allocates a new Instant object every time), so it would add more overhead, especially with a high volume of messages.

Edit: I'd also add that I went with a nanoTime()-based clock because Instant.now() only yielded millisecond precision on my machine.
Never mix System.nanoTime with Instant.now.
The OpenJDK source code shows the System.nanoTime implementation as:
@IntrinsicCandidate
public static native long nanoTime();
That means the method is implemented as native code.
Without digging further, we know that nanoTime does not use the same source as Instant.now. The Instant class uses the date-time clock of the host OS. On conventional computer hardware (laptops, desktops, common servers), the date-time clock of the host OS resolves to microseconds at best, not nanoseconds. So the nanoTime feature must use another source for tracking elapsed nanos.
The CPU is the likely source. Modern CPUs have a regular “heartbeat” that keeps them running, timing their calculations. For example, a 3 GHz CPU has three billion heartbeats per second. I imagine the nanoTime native code taps into this heartbeat for its count of elapsed nanoseconds.
The point here is that these two sources of time are completely separate and distinct.
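You can demonstrate the separation yourself. The following is a minimal sketch (plain standard-library calls, nothing assumed beyond them) that prints one reading from each source; the nanoTime value is counted from some arbitrary origin chosen by the JVM, so its raw number bears no relation to the epoch that Instant uses:

import java.time.Instant;

public class TwoClocks {
    public static void main(String[] args) {
        // Wall-clock time: seconds and nanos counted from the epoch 1970-01-01T00:00:00Z.
        Instant wallClock = Instant.now();

        // Monotonic time: nanoseconds since some fixed but arbitrary origin,
        // meaningful only for measuring elapsed time within this JVM.
        long monotonic = System.nanoTime();

        System.out.println("Instant.now()     = " + wallClock);
        System.out.println("System.nanoTime() = " + monotonic);
        // The two readings are unrelated numbers drawn from separate clocks.
    }
}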
Instant.now reads the date-time clock of the host OS. That clock drifts, so the OS periodically corrects it by checking in with a time server (via NTP or similar). When such a correction lands, the reported time can jump, forward or backward(!). The jump may be minuscule or may be major, depending on (a) how long since the last drift-correction, and (b) the drifting tendency of your particular computer’s clock. (Another issue is the accuracy of your time server.†)

You have mixed these two separate time sources with your code:

NANOTIME_OFFSET = TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano() - nanoTime;

The clock behind Instant is a moving target, repeatedly being adjusted at any moment, jumping ahead or jumping backward. So your NANOTIME_OFFSET, captured once at class-load time, is naïve and invalid.
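To see the offset move, you can recompute it periodically rather than once. Here is a minimal sketch along those lines (the class name and the one-minute interval are arbitrary choices of mine, not anything from the question): it repeats the question's calculation and prints how far the result has wandered since start-up, which is precisely the creeping "latency" being measured.

import java.time.Instant;
import java.util.concurrent.TimeUnit;

public class OffsetDrift {
    public static void main(String[] args) throws InterruptedException {
        long initialOffset = computeOffset();
        while (true) {
            long currentOffset = computeOffset();
            // How far the supposedly fixed offset has moved since start-up, in microseconds.
            long driftMicros = TimeUnit.NANOSECONDS.toMicros(currentOffset - initialOffset);
            System.out.println("offset drift since start: " + driftMicros + " µs");
            TimeUnit.MINUTES.sleep(1);
        }
    }

    // The same calculation as the question's static initializer, just done repeatedly.
    private static long computeOffset() {
        Instant instant = Instant.now();
        long nanoTime = System.nanoTime();
        return TimeUnit.SECONDS.toNanos(instant.getEpochSecond()) + instant.getNano() - nanoTime;
    }
}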
The Javadoc for nanoTime says: 👉🏽 “This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time.”

For the sake of completeness, I’ll mention that System.currentTimeMillis (now legacy) is supplanted by Instant.now. Both use the date-time clock of the host OS. Neither should be mixed with System.nanoTime.
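Here is a sketch of keeping the two clocks apart, per that Javadoc: stamp anything that leaves the process with a wall-clock reading, and reserve System.nanoTime for elapsed-time measurement inside a single JVM. The class and the doSomeWork placeholder are hypothetical, just for illustration.

import java.time.Instant;

public class SeparateClocks {
    public static void main(String[] args) {
        // Wall-clock timestamp for anything leaving the process, e.g. a message header.
        Instant sentAt = Instant.now();
        System.out.println("stamp outgoing message with: " + sentAt);

        // Monotonic clock strictly for elapsed time within this one process.
        long start = System.nanoTime();
        doSomeWork();
        long elapsedNanos = System.nanoTime() - start;
        System.out.println("work took " + elapsedNanos + " ns");
    }

    // Hypothetical stand-in for whatever work is being timed.
    private static void doSomeWork() {
        double sink = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sink += Math.sqrt(i);
        }
        if (sink < 0) System.out.println(sink); // keep the loop from being optimized away
    }
}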
† Nowadays you can buy an atomic clock for much money (though perhaps less than you might expect). A recent topic on YouTube is rolling your own super-accurate, very cheap time server by capturing the current moment as broadcast by satellite navigation (GPS, etc.), then serving it from an inexpensive computer such as a Raspberry Pi. To go down this time-geek rabbit hole, you might start with the Jeff Geerling channels.