I am trying to configure persistent HTTP caching using the org.apache.http.impl.client.cache.CachingHttpClients
builder. However, when I configure a cache directory, the cache never seems to be read back from disk.
I tried to set up persistent caching using setCacheDir, i.e.,
CachingHttpClients.custom()
    .setCacheDir(cacheDir)
    .setDeleteCache(false)
    .build();
(see below for a complete example)
The behaviour I'm seeing:
Files are written to cacheDir with names of the form 1703170640727.0000000000000001-997b0365.User.-url-path. So far so good.
However, the cache entries that were written to disk are not picked up after a restart, and I haven't been able to find a way to make that happen.
How do I initialize Apache's HTTP cache, so caching persists after restarts?
Minimal reproducible example. Running this multiple times results in a "Cache miss" every time, although there are cache entries being written to disk. I would expect reruns to use the cache that was written to disk. Note that I do see a cache hit if I perform two requests to the same URL within the same run.
File cacheDir = Path.of(System.getProperty("java.io.tmpdir")).resolve("my-http-cache").toFile();
if (!cacheDir.exists() && !cacheDir.mkdirs()) {
    throw new RuntimeException("Could not create cache directory " + cacheDir + ".");
}
try (var client = CachingHttpClients.custom()
        .setCacheDir(cacheDir)
        .setDeleteCache(false)
        .useSystemProperties()
        .build()) {
    HttpCacheContext context = HttpCacheContext.create();
    // Close the response when done so that the underlying connection is released.
    try (CloseableHttpResponse response = client.execute(new HttpGet("https://api.github.com/repos/finos/common-domain-model"), context)) {
        CacheResponseStatus responseStatus = context.getCacheResponseStatus();
        switch (responseStatus) {
            case CACHE_HIT:
                System.out.println("Cache hit!");
                break;
            case CACHE_MODULE_RESPONSE:
                System.out.println("The response was generated directly by the caching module");
                break;
            case CACHE_MISS:
                System.out.println("Cache miss!");
                break;
            case VALIDATED:
                System.out.println("Cache hit after validation");
                break;
        }
    }
}
Apache's HTTP caching keeps track of a cache entry for each eligible HTTP response. Each cache entry points to an abstract "resource" object, which holds the cached response body. With CachingHttpClients.custom().setCacheDir(cacheDir), this resource is a file, i.e., responses are saved to disk rather than kept in memory, which reduces memory usage. However, the cache entries themselves are still kept in memory, so they do not survive a restart.
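To illustrate the distinction, here is a toy model using only the JDK. It is not Apache's implementation: the class `ToyFileCache` and its methods `put`, `lookup`, `reopen`, and `filesOnDisk` are invented for this sketch. Bodies go to files in the cache directory, while the key-to-file index lives in a `HashMap`. A second instance on the same directory stands in for a restarted JVM: the files are still there, but the index is empty, so the lookup misses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Stream;

/** Toy model only: file-backed bodies, in-memory index. */
final class ToyFileCache {
    private final Path dir;
    // The index is the part that is lost when the JVM exits.
    private final Map<String, Path> index = new HashMap<>();

    ToyFileCache(Path dir) {
        this.dir = dir;
    }

    /** Creates a cache backed by a fresh temporary directory. */
    static ToyFileCache inFreshTempDir() {
        try {
            return new ToyFileCache(Files.createTempDirectory("toy-http-cache"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates a restart: same directory, brand-new (empty) index. */
    ToyFileCache reopen() {
        return new ToyFileCache(dir);
    }

    /** Writes the body to a new file and records that file in the in-memory index. */
    void put(String key, String body) {
        try {
            Path file = Files.createTempFile(dir, "entry-", ".body");
            Files.write(file, body.getBytes(StandardCharsets.UTF_8));
            index.put(key, file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Consults only the in-memory index; the directory is never scanned. */
    Optional<String> lookup(String key) {
        Path file = index.get(key);
        if (file == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Number of files currently present in the cache directory. */
    long filesOnDisk() {
        try (Stream<Path> files = Files.list(dir)) {
            return files.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ToyFileCache firstRun = ToyFileCache.inFreshTempDir();
        firstRun.put("https://example.org/a", "hello");
        System.out.println("same run, hit: " + firstRun.lookup("https://example.org/a").isPresent());

        ToyFileCache secondRun = firstRun.reopen();
        System.out.println("files on disk after restart: " + secondRun.filesOnDisk());
        System.out.println("after restart, hit: " + secondRun.lookup("https://example.org/a").isPresent());
    }
}
```

This mirrors the symptom in the question: a hit within one run, files left on disk, and a miss after a restart because nothing rebuilds the index.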
The following implementation could be used to persist cache entries as well:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.lang.reflect.Field;
import java.util.Map;

import org.apache.http.annotation.Contract;
import org.apache.http.annotation.ThreadingBehavior;
import org.apache.http.client.cache.HttpCacheEntry;
import org.apache.http.client.cache.HttpCacheUpdateCallback;
import org.apache.http.impl.client.cache.CacheConfig;
import org.apache.http.impl.client.cache.ManagedHttpCacheStorage;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * A variant of {@link org.apache.http.impl.client.cache.ManagedHttpCacheStorage}
 * whose cache entries persist across restarts.
 */
@Contract(threading = ThreadingBehavior.SAFE)
public class PersistentHttpCacheStorage extends ManagedHttpCacheStorage {
    private static final Logger LOGGER = LoggerFactory.getLogger(PersistentHttpCacheStorage.class);
    private static final String ENTRIES_FILE_NAME = "ENTRIES";

    private Map<String, HttpCacheEntry> entries;
    private final File cacheDir;
    private final File entriesFile;

    @SuppressWarnings("unchecked")
    public PersistentHttpCacheStorage(CacheConfig config, File cacheDir) {
        super(config);
        this.cacheDir = cacheDir;
        this.entriesFile = new File(cacheDir, ENTRIES_FILE_NAME);
        // A hack to access the private entries map of the super class.
        try {
            Field f = ManagedHttpCacheStorage.class.getDeclaredField("entries");
            f.setAccessible(true);
            this.entries = (Map<String, HttpCacheEntry>) f.get(this);
        } catch (NoSuchFieldException | IllegalAccessException e) {
            throw new RuntimeException(e);
        }
    }

    /** Loads previously persisted entries, if any. Call this once before using the storage. */
    @SuppressWarnings("unchecked")
    public void initialize() {
        try {
            if (!cacheDir.exists() && !cacheDir.mkdirs()) {
                throw new RuntimeException("Could not create cache directory " + cacheDir + ".");
            }
            // An empty file (e.g. left by a run that never cached anything) holds no
            // serialized map; reading it would fail with an EOFException, so skip it.
            if (entriesFile.length() > 0) {
                try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(entriesFile))) {
                    Map<String, HttpCacheEntry> persistentEntries = (Map<String, HttpCacheEntry>) in.readObject();
                    this.entries.putAll(persistentEntries);
                    LOGGER.debug("Read {} HTTP entries from cache.", this.entries.size());
                }
            } else if (!entriesFile.exists()) {
                LOGGER.debug("No cached entries exist. Creating a new file at {}.", entriesFile);
                if (!entriesFile.createNewFile()) {
                    throw new RuntimeException("Could not create entries file " + entriesFile + ".");
                }
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    /** Serializes the complete entries map to the entries file, replacing its previous contents. */
    private void writeEntries() throws IOException {
        synchronized (this) {
            try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(entriesFile))) {
                out.writeObject(entries);
            }
        }
    }

    @Override
    public void putEntry(String key, HttpCacheEntry entry) throws IOException {
        super.putEntry(key, entry);
        writeEntries();
    }

    @Override
    public HttpCacheEntry getEntry(String key) throws IOException {
        return super.getEntry(key);
    }

    @Override
    public void removeEntry(String key) throws IOException {
        super.removeEntry(key);
        writeEntries();
    }

    @Override
    public void updateEntry(String key, HttpCacheUpdateCallback callback) throws IOException {
        super.updateEntry(key, callback);
        writeEntries();
    }

    /** Disposes of all cached resources and also removes the entries file. */
    @Override
    public void shutdown() {
        super.shutdown();
        if (!entriesFile.delete()) {
            LOGGER.error("Could not delete entries file " + entriesFile + ".");
        }
    }
}
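The core of this approach is a round trip of the entry map through Java serialization. Below is a minimal JDK-only sketch of that round trip, under some assumptions: `IndexFile` is a hypothetical helper (not part of HttpClient), and plain `String` values stand in for `HttpCacheEntry`. Unlike the class above, the sketch writes to a temporary file first and then moves it into place. That is a design choice of this sketch, intended to make it less likely that an interrupted write leaves a truncated file behind.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that saves and loads a key-to-value index with Java serialization. */
final class IndexFile {
    private final Path file;

    IndexFile(Path file) {
        this.file = file.toAbsolutePath();
    }

    /** Creates an index file location inside a fresh temporary directory. */
    static IndexFile inFreshTempDir() {
        try {
            return new IndexFile(Files.createTempDirectory("index-demo").resolve("ENTRIES"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a copy of the index to a temporary file, then moves it over the real file. */
    void save(Map<String, String> index) {
        try {
            Path tmp = Files.createTempFile(file.getParent(), "index-", ".tmp");
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(tmp))) {
                out.writeObject(new HashMap<>(index));
            }
            Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the index back; a missing or empty file yields an empty map. */
    Map<String, String> load() {
        try {
            if (!Files.exists(file) || Files.size(file) == 0) {
                return new HashMap<>();
            }
            try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
                @SuppressWarnings("unchecked")
                Map<String, String> loaded = (Map<String, String>) in.readObject();
                return loaded;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        IndexFile indexFile = IndexFile.inFreshTempDir();
        Map<String, String> index = new HashMap<>();
        index.put("https://example.org/a", "entry-1.body");
        indexFile.save(index);
        // A later run would call load() at start-up to restore the index.
        System.out.println(indexFile.load());
    }
}
```

Rewriting the whole map on every change is simple but costs time proportional to the number of entries, which is the same trade-off the class above makes.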
Usage:
CacheConfig cacheConfig = CacheConfig.DEFAULT;
File cacheDir = Path.of(System.getProperty("java.io.tmpdir")).resolve("my-http-cache").toFile();
if (!cacheDir.exists() && !cacheDir.mkdirs()) {
    throw new RuntimeException("Could not create cache directory " + cacheDir + ".");
}
PersistentHttpCacheStorage storage = new PersistentHttpCacheStorage(cacheConfig, cacheDir);
storage.initialize(); // Necessary for loading the persisted cache entries
CloseableHttpClient client = CachingHttpClients.custom()
    .setCacheConfig(cacheConfig)
    .setHttpCacheStorage(storage)
    .setCacheDir(cacheDir)
    .setDeleteCache(false)
    .useSystemProperties()
    .build();