
Connection to Redis cache fails after restart - Azure


We are using the following code to connect to our caches (in-memory and Redis):

    settings
        .WithSystemRuntimeCacheHandle()
        .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime)
        .And
        .WithRedisConfiguration(CacheManagerRedisConfigurationKey, connectionString)
        .WithMaxRetries(3)
        .WithRetryTimeout(100)
        .WithJsonSerializer()
        .WithRedisBackplane(CacheManagerRedisConfigurationKey)
        .WithRedisCacheHandle(CacheManagerRedisConfigurationKey, true)
        .WithExpiration(CacheManager.Core.ExpirationMode.Absolute, defaultExpiryTime);

It works fine, but sometimes the machine is restarted (automatically by Azure, where we host it), and after the restart the connection to Redis fails with the following exception:

    Connection to '{connection string}' failed.
       at CacheManager.Core.BaseCacheManager`1..ctor(String name, ICacheManagerConfiguration configuration)
       at CacheManager.Core.BaseCacheManager`1..ctor(ICacheManagerConfiguration configuration)
       at CacheManager.Core.CacheFactory.Build[TCacheValue](String cacheName, Action`1 settings)
       at CacheManager.Core.CacheFactory.Build(Action`1 settings)

According to the Azure Redis Cache FAQ (https://learn.microsoft.com/en-us/azure/redis-cache/cache-faq), section "Why was my client disconnected from the cache?", this can happen after a redeploy.

The question is

We are sure the connection string is correct.


Solution

  • Most clients (including StackExchange.Redis) reconnect automatically after a connection break. However, your connect timeout must be large enough for the reconnect to succeed. Remember, you only connect once, so it's fine to give the system plenty of time to reconnect. A higher connect timeout is especially useful after a blip, when a burst of connections or reconnections causes a CPU spike and some connections may not complete in time.

    In this case, I see RetryTimeout set to 100. If this is the connection timeout, check whether it is in milliseconds; 100 milliseconds is too low. You might want something closer to 10 seconds (remember, it's a one-time cost, so give it enough time to connect).
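
    As a rough sketch of raising the connect timeout: this assumes the value that matters is the StackExchange.Redis `connectTimeout` option carried in the connection string, which may be separate from CacheManager's WithRetryTimeout. The class and method names (`RedisConnectionSettings`, `WithConnectTimeout`) are hypothetical helpers, not part of either library; only `ConfigurationOptions.Parse`, `ConnectTimeout`, and `ToString` come from StackExchange.Redis.

```csharp
using StackExchange.Redis;

// Hypothetical helper, not part of CacheManager or StackExchange.Redis.
public static class RedisConnectionSettings
{
    // Returns a copy of the connection string with the connect timeout
    // (in milliseconds) set to the given value.
    public static string WithConnectTimeout(string connectionString, int timeoutMs)
    {
        // Parse the existing string so host, password, ssl, etc. are preserved.
        ConfigurationOptions options = ConfigurationOptions.Parse(connectionString);

        // Give the one-time connect enough room, e.g. 10000 ms as suggested above.
        options.ConnectTimeout = timeoutMs;

        // Serialize back to a connection string for WithRedisConfiguration.
        return options.ToString();
    }
}
```

    The result could then be passed in place of the original string, for example `.WithRedisConfiguration(CacheManagerRedisConfigurationKey, RedisConnectionSettings.WithConnectTimeout(connectionString, 10000))`. Whether this resolves the failure after a restart depends on the timeout actually being the cause.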