c++windows-subsystem-for-linuxc++-chronodriftubuntu-24.04

std::chrono now() induces very significant drift with Ubuntu 24.04 under WSL2


I've got an algorithm I've used for years to emulate a real-time clock in a non-real-time environment. It's always worked exceptionally well until used on Ubuntu 24.04 in WSL2 in Windows 11. The logic predicts in system clock time when each next time-step should occur and repeatedly calls chrono's now() in a loop until that point in time is reached, then continues with execution of the next frame. When set up in this environment, now()'s return value will occasionally jump a significant amount forward in time (e.g., ~25 seconds) between successive calls. Simply observing the application run reveals that its not a matter of now() taking ~25 seconds to return a value as the whole app will complete in ~5 seconds when it should have taken ~30. The implication is that calling now() has actually altered the clock's time. To verify this, I wrapped my sample code in a bash script where I could monitor and resync WSL's clock ... now() is indeed altering the clock's time.

The sample code has been built in this environment with both g++ and clang++, C++17 and C++20. Same effect observed in all four cases.

The sample code has been observed to work properly with no alteration to clock's time on a machine running Ubuntu-24.04 (older h/w) and under Windows11/WSL2/Ubuntu-22.04 (identical h/w)

I've found countless posts discussing a Linux clock drift problem when running under WSL. Most claim to have 'fixed' the problem by resyncing with the h/w clock. This does not 'fix' the drift problem, it merely resets it. There is clearly a bug somewhere in the system, could this be related? Is now() in some WSL configurations affecting the time and is my extreme case of hammering now() in a loop simply pronouncing the problem. How do I fix this?

reproducible example code:

#include <iostream>
#include <chrono>
#include <ctime>
#include <iomanip>
#include <sstream>

void print(std::chrono::duration<long double> elapsed_seconds, 
    std::chrono::_V2::system_clock::time_point, 
    std::chrono::duration<long double> error);

int main() 
{
    std::cout << "commanded_elapsed_time, corresponding_measured_time_points, error" << std::endl; 
    using namespace std::chrono_literals;
    auto start_tp = std::chrono::system_clock::now();
    auto dt = 1.0s;
    auto elapsed = 0.0s;
    auto next_tp = start_tp + elapsed;
    std::chrono::_V2::system_clock::time_point now;
    print(elapsed, start_tp, start_tp - next_tp);
    for (int t = 1; t < 30; t++)
    {
        elapsed = dt * t;
        next_tp = start_tp + elapsed;
        while((now = std::chrono::system_clock::now()) < next_tp) {}
        print(elapsed, now, now - next_tp);
    }
    return 0;
}

void print(
    std::chrono::duration<long double> elapsed_seconds, 
    std::chrono::_V2::system_clock::time_point tp, 
    std::chrono::duration<long double> error)
{
    std::time_t currentTime = std::chrono::system_clock::to_time_t(tp);
    std::tm* localTime = std::localtime(&currentTime);
    std::stringstream ss;
    ss << std::fixed << std::setprecision(3) << std::setfill('0') << std::setw(6) << elapsed_seconds.count() << ", ";
    ss << std::put_time(localTime, "%H:%M:%S");
    auto nowInMicroseconds = std::chrono::time_point_cast<std::chrono::microseconds>(tp);
    auto epoch = nowInMicroseconds.time_since_epoch();
    auto microseconds = std::chrono::duration_cast<std::chrono::microseconds>(epoch).count() % 1000000;
    ss << '.' << std::setfill('0') << std::setw(9) << microseconds * 1000 << ", ";
    ss << std::fixed << std::setprecision(9) << error.count();
    std::cout << ss.str() << std::endl;
}

test script:

#! /bin/bash
TRUE=0
FALSE=1
using_wsl=$FALSE
echo "kernel name:........" $(uname -s)
echo "kernel release:....." $(uname -r)
echo "kernel version:....." $(uname -v)
echo "machine:............" $(uname -m)
echo "processor:.........." $(uname -p)
echo "hardware platform..." $(uname -i)
echo "operating system...." $(uname -o)
if [ "$(systemd-detect-virt)" = "wsl" ]; then
    echo "running in wsl...... yes"
    using_wsl=$TRUE
    if ! command -v hwclock >/dev/null 2>&1; then
        echo ""
        echo "When running this script within WSL, the 'hwclock'"
        echo "command is used to sync the WSL clock with the"
        echo "host system's clock. 'hwclock' is not found on"
        echo "this system. Install and rerun test."
        exit 1
    fi
else
    echo "running in wsl...... no"
    using_wsl=$FALSE
fi
echo ""
if [ $using_wsl -eq $TRUE ]; then
    echo "WSL time before syncing with host...:" $(date +"%T.%6N")
    echo "syncing WSL clock with host ..."
    sudo hwclock -s
fi
start_time=$(date +"%s.%N")
echo "test system start time..............: $(date -d "@$start_time" +"%T.%6N")"
echo "test starting ..."
echo ""
./zero_drift_dt
end_test_time=$(date +"%s.%N")
echo ""
echo "... test complete"
echo "test system finish time............................: $(date -d "@$end_test_time" +"%T.%6N")"
if [ $using_wsl -eq $TRUE ]; then
    sudo hwclock -s
    echo "WSL time after syncing with host (again)...........:" $(date +"%T.%6N")
fi
final_time=$(date +"%s.%N")
expected_duration=$(awk "BEGIN {print $end_test_time - $start_time}")
actual_duration=$(awk "BEGIN {print $final_time - $start_time}")
error_duration=$(awk "BEGIN {print $end_test_time - $final_time}")
echo "Expected approximate test duration (s).............:" $expected_duration
echo "Measured test duration (s).........................:" $actual_duration
echo "During this test, now() moved clock forward (s)....:" $error_duration

Results on problem system:

~/Projects/proving_grounds/now_error$ sudo ./test.sh
[sudo] password for blumert:
kernel name:........ Linux
kernel release:..... 5.15.153.1-microsoft-standard-WSL2
kernel version:..... #1 SMP Fri Mar 29 23:14:13 UTC 2024
machine:............ x86_64
processor:.......... x86_64
hardware platform... x86_64
operating system.... GNU/Linux
running in wsl...... yes

WSL time before syncing with host...: 13:39:30.320309
syncing WSL clock with host ...
test system start time..............: 13:39:31.001844
test starting ...

commanded_elapsed_time, corresponding_measured_time_points, error
00.000, 13:39:31.003072000, 0.000000000
01.000, 13:39:32.003072000, 0.000000070
02.000, 13:39:52.464335000, 19.461263572
03.000, 13:39:52.464537000, 18.461465685
04.000, 13:39:52.464569000, 17.461497372
05.000, 13:39:52.464573000, 16.461501714
06.000, 13:39:52.464575000, 15.461503561
07.000, 13:39:52.464577000, 14.461505319
08.000, 13:39:52.464579000, 13.461506932
09.000, 13:39:52.464580000, 12.461508617
10.000, 13:39:52.464582000, 11.461510324
11.000, 13:39:52.464584000, 10.461512480
12.000, 13:39:52.464586000, 9.461514319
13.000, 13:39:52.464588000, 8.461515990
14.000, 13:39:52.464589000, 7.461517739
15.000, 13:39:52.464591000, 6.461519320
16.000, 13:39:52.464593000, 5.461520922
17.000, 13:39:52.464594000, 4.461522511
18.000, 13:39:52.464596000, 3.461524055
19.000, 13:39:52.464597000, 2.461525592
20.000, 13:39:52.464599000, 1.461527167
21.000, 13:39:52.464600000, 0.461528720
22.000, 13:39:57.473118000, 4.470046550
23.000, 13:39:57.473178000, 3.470106269
24.000, 13:39:57.473182000, 2.470110610
25.000, 13:39:57.473184000, 1.470112218
26.000, 13:39:57.473185000, 0.470113725
27.000, 13:40:02.476388000, 4.473316514
28.000, 13:40:02.476452000, 3.473380680
29.000, 13:40:02.476457000, 2.473385528

... test complete
test system finish time............................: 13:40:02.477504
WSL time after syncing with host (again)...........: 13:39:43.002630
Expected approximate test duration (s).............: 31.4757
Measured test duration (s).........................: 12.0018
During this test, now() moved clock forward (s)....: 19.4738

Solution

  • System clock is the wrong type to use, as it can (and does) vary. From cppreference (emphasis mine):

    Class std::chrono::system_clock represents the system-wide real time wall clock.

    It may not be monotonic: on most systems, the system time can be adjusted at any moment

    You want to use steady_clock

    Class std::chrono::steady_clock represents a monotonic clock. The time points of this clock cannot decrease as physical time moves forward and the time between ticks of this clock is constant. This clock is not related to wall clock time (for example, it can be time since last reboot), and is most suitable for measuring intervals