javacachingcaffeine

Caffeine cache delayed entries expiration with after-creation policy


I'm using Caffeine lib for caching purposes with 1-sec expiration interval after entry creation. It turned out that entries expire with some delay sometimes it could take up to 2x more time to expire instead of configured 1-sec interval. Based on my experiments executor and scheduler threads count config has no effect on that delay.

There is my test snippet where I measure time spent in cache for the particular entry and print it out:

private final Cache<String, LinkedList<Pair<Long, Long>>> groupedEvents = Caffeine.newBuilder()
    .expireAfter(new Expiry<String, LinkedList<Pair<Long, Long>>>() {
        public long expireAfterCreate(String key, LinkedList<Pair<Long, Long>> val, long currentTime) {
            return TimeUnit.SECONDS.toNanos(1);
        }
        public long expireAfterUpdate(String key, LinkedList<Pair<Long, Long>> val, long currentTime, long currentDuration) {
            return currentDuration;
        }
        public long expireAfterRead(String key, LinkedList<Pair<Long, Long>> val, long currentTime, long currentDuration) {
            return currentDuration;
        }
    })
    .scheduler(Scheduler.forScheduledExecutorService(Executors.newScheduledThreadPool(4)))
    .executor(Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors() * 2))
    .removalListener((key, events, cause) -> {
        long timeInCache = System.currentTimeMillis() - events.get(0).getLeft();
        System.out.println(String.format("key %s, values count: %s, timeInCache: %s ms",
                key, events.size(), timeInCache));
    }).build(); 

@Test
public void test() throws InterruptedException {

    for (int i = 0; i < 1000; i++) {
        Thread.sleep(30);
        String key = String.valueOf(new Random().nextInt(30));

        groupedEvents.asMap().compute(key, (s, values) -> {
            if (values == null) {
                values = new LinkedList<>();
            }
            values.add(Pair.of(System.currentTimeMillis(), 123L));
            return values;
        });
    }

Output is the following:

key 10, values count: 1, timeInCache: 1223 ms
key 0, values count: 3, timeInCache: 1470 ms
key 20, values count: 1, timeInCache: 2295 ms
key 15, values count: 2, timeInCache: 2325 ms
key 16, values count: 1, timeInCache: 2023 ms
key 23, values count: 1, timeInCache: 1566 ms
key 18, values count: 1, timeInCache: 2052 ms
key 14, values count: 2, timeInCache: 1079 ms
key 3, values count: 3, timeInCache: 1628 ms
key 4, values count: 2, timeInCache: 1109 ms
key 2, values count: 2, timeInCache: 1841 ms
key 17, values count: 1, timeInCache: 1535 ms
key 13, values count: 2, timeInCache: 1011 ms
key 7, values count: 1, timeInCache: 1471 ms
key 12, values count: 1, timeInCache: 1441 ms

As you see the load is not high and around 33 entry addition per sec (based on the Thread.sleep(30)) but for some entries, it takes up to ~2300ms instead of the desired 1sec to expire after creation and that delay is critical for my application since the end-user will not get my data within 1-1.3 sec interval.

Is there any chance to tune that entry eviction time to reduce the delay? Im using latest 'com.github.ben-manes.caffeine:caffeine:2.8.5'' version


Solution

  • The scheduled executions are paced at a minimum interval of ~1.07 seconds, as described in Caffeine.scheduler:

    Specifies the scheduler to use when scheduling routine maintenance based on an expiration event. This augments the periodic maintenance that occurs during normal cache operations to allow for the prompt removal of expired entries regardless of whether any cache activity is occurring at that time. By default, {@link Scheduler#disabledScheduler()} is used.

    The scheduling between expiration events is paced to exploit batching and to minimize executions in short succession. This minimum difference between the scheduled executions is implementation-specific, currently at ~1 second (2^30 ns). In addition, the provided scheduler may not offer real-time guarantees (including {@link ScheduledThreadPoolExecutor}). The scheduling is best-effort and does not make any hard guarantees of when an expired entry will be removed.

    The long durations is due to the entry expiring sometime after the last maintenance run. This can result in a maximum lifetime of 2x the pace interval, e.g. when the next entry to expire is just 1ns in the future. This is extended a bit longer because the underlying system doesn't provide real-time guarantees, so it lost another dozen milliseconds in your example.

    To offer O(1) expiration, the cache holds entries in a timing wheel where the smallest bucket duration is 1.07s. To avoid constantly re-running the maintenance it paces the executions by this minimum threshold. Even if this pacer was removed, the worst-case lifetime would remain because of the bucket size, as an entry ready to be expired must wait for the entire bucket to be evictable.

    Therefore the only way to reduce this worst-case would be to add a wheel using a smaller duration. That wheel might have 2 buckets at 2^29ns (530ms), 4 at 2^28ns (268ms), etc. The worst case delay would be the bucket's duration, so we would have to determine what new value is acceptable. Please open a Github issue if you want to explore this option.

    That explains the technical details and a possible improvement. However it is a balancing act if another user wants an even finer resolution. At the extreme this leads a perfect resolution, requiring an O(lg n) cost to maintain a sorted order which degrades performance for very large caches. As we try to balance design tradeoffs, we might cause some scenarios not being a good fit and it is perfectly fine to consider alternatives better suited for your trade-offs.

    You might, for example, prefer to insert every entry into an ScheduledExecutorService instead of relying on Caffeine's expiration. Your code would schedule the removal in the compute. That has a worst-case delay of an extra 13ms in my test runs.

    groupedEvents.asMap().compute(key, (s, values) -> {
      if (values == null) {
        scheduledExecutor.schedule(
          () -> groupedEvents.invalidate(key), 1, TimeUnit.SECONDS);
        values = new LinkedList<>();
      }
      values.add(Pair.of(System.currentTimeMillis(), 123L));
      return values;
    });