javamultithreadingcachingguavagoogle-guava-cache

How to reliably drop records from Guava LoadingCache?


I am using Guava LoadingCache to populate some data into it and I want to remove all the entries from that LoadingCache every 1 minute.

public class MetricHolder {
  private final ExecutorService executor = Executors.newFixedThreadPool(2);
  private final LoadingCache<String, AtomicLongMap<String>> clientIdMetricCounterCache =
      CacheBuilder.newBuilder().expireAfterWrite(1, TimeUnit.MINUTES)
          .removalListener(RemovalListeners.asynchronous(new SendToDatabase(), executor))
          .build(new CacheLoader<String, AtomicLongMap<String>>() {
            @Override
            public AtomicLongMap<String> load(String key) throws Exception {
              return AtomicLongMap.create();
            }
          });

  private static class Holder {
    private static final MetricHolder INSTANCE = new MetricHolder();
  }

  public static MetricHolder getInstance() {
    return Holder.INSTANCE;
  }

  private MetricHolder() {}

  public void increment(String clientId, String name) throws ExecutionException {
    clientIdMetricCounterCache.get(clientId).incrementAndGet(name);
  }

  public LoadingCache<String, AtomicLongMap<String>> getClientIdMetricCounterCache() {
    return clientIdMetricCounterCache;
  }

  private static class SendToDatabase implements RemovalListener<String, AtomicLongMap<String>> {
    @Override
    public void onRemoval(RemovalNotification<String, AtomicLongMap<String>> notification) {
      String key = notification.getKey();
      AtomicLongMap<String> value = notification.getValue();
      System.out.println(key);
      System.out.println(value);
      // sending these key/value to some other system

    }
  }
}

I am calling increment method from lot of different places in the code in a multithreaded way. So for a period of 1 minute it will populate lot of metrics in clientIdMetricCounterCache. Now I want to drop all those metrics reliably after every 1 minute and send all those metrics to database.

In my case, sometimes write to increment method might be very slow but still I want to drop all those entries every 1 minute and I am not doing any read at all on this cache, just writing to it and then dropping those records by sending to some other system. Below is what I saw in the Guava wiki

Caches built with CacheBuilder do not perform cleanup and evict values "automatically," or instantly after a value expires, or anything of the sort. Instead, it performs small amounts of maintenance during write operations, or during occasional read operations if writes are rare.

So how does expireAfterWrite works? Does it work like a scheduler which will run every 1 minute and delete all the entries whatever is there in clientIdMetricCounterCache and then again it will wake up after 1 minute and delete all the entries from the same cache and keep going like that? After reading the wiki, I doubt it works like that. If it doesn't, then how can I reliably drop those records every 1 minute and send to some other system as my writes can be rare for some time?

Looks like I may have to use Guava TimeLimiter interface and SimpleTimeLimiter or may be ScheduledExecutorService to reliably timeout the call and then drop the entries? If yes, can anyone provide an example how will this work in my current example?


Solution

  • To me, it looks like you're misusing the cache, where a Map would do. You're using no expiration, no size limit, no caching, you're just gathering stats.

    About the only feature you're using is the loading aspect, and that's not really worth it.

    I'd suggest to use an AtomicReference<ConcurrentHashMap<String, AtomicLongMap>> instead:

    With the replacement, you can be sure that your stats don't interfere, however, you should wait a bit after getAndSet as there may be threads who just obtained the reference and are about to write.

    It will produce more garbage than the original approach, but all the garbage will be short living, so you might actually make the GC more happy.

    It's simple and need no deep knowledge of a library or its implementation details.


    I guess, volatile instead of AtomicReference would also do.