amazon-web-services amazon-elasticache amazon-dynamodb-streams

ElastiCache prepopulate when node starts


I've been looking at this AWS tutorial on building a geospatial app on AWS. It uses DynamoDB to store the data, with a Lambda function set up to write the data to Redis whenever an item is written to DynamoDB.

The solution makes sense, but I'm concerned about failures. What would happen if, say, the Redis cluster restarted for some reason? The data already in DynamoDB would no longer be in the cache, and would never get back there unless it was modified, right? Or am I missing something?

If my analysis is right: what would be a good way to prepopulate the cache whenever a node spins up? Is this possible?

P.S.: The reason I'm going for DynamoDB + ElastiCache for Redis is the same one the video gives: it's cheaper.


Solution

  • The tutorial doesn't go into the details of the Redis setup. ElastiCache for Redis can run multiple nodes, with replication within a region and, using Global Datastore, across regions. AWS handles a Redis node failure by promoting a replica to primary and then rebooting or replacing the failed node. In the unlikely event that all nodes go down, you can enable persistence (backups) so that the cluster can be restored from the most recent backup.

    Of course, if none of these are sufficient, you can set up your own backup of the data and restore it out of band, for example from a Fargate task or an EC2 instance. It all depends on how complex you want the setup to be and how much you're willing to pay.
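
As a rough illustration of the out-of-band option, here is a minimal Python sketch that rebuilds the Redis geo index by scanning the whole DynamoDB table. It is not taken from the tutorial. The attribute names `id`, `lon` and `lat`, the Redis key `locations`, and the function names are all assumptions; adjust them to your schema. The `table` argument is expected to behave like a boto3 `Table` resource (a `scan` method that pages via `LastEvaluatedKey`), and `redis_client` like a redis-py client (an `execute_command` method).

```python
from typing import Any, Dict, Iterator


def scan_all(table: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item in the table, following DynamoDB scan pagination."""
    kwargs: Dict[str, Any] = {}
    while True:
        page = table.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return  # no more pages
        kwargs["ExclusiveStartKey"] = last_key


def rehydrate(table: Any, redis_client: Any, geo_key: str = "locations") -> int:
    """Re-add every item to a Redis geo set; returns the number of items written.

    Assumes (hypothetically) that each item has 'id', 'lon' and 'lat' attributes.
    """
    count = 0
    for item in scan_all(table):
        # boto3 returns numbers as Decimal, so convert before sending to Redis.
        redis_client.execute_command(
            "GEOADD", geo_key, float(item["lon"]), float(item["lat"]), str(item["id"])
        )
        count += 1
    return count


# Possible wiring (table name and endpoint are placeholders):
#   import boto3, redis
#   table = boto3.resource("dynamodb").Table("places")
#   rehydrate(table, redis.Redis(host="my-cache.example.com", port=6379))
```

A full scan consumes read capacity and gets slower as the table grows, so this is something to run after a total cache loss rather than on every node restart. How the job gets triggered (manually, on a schedule, or from some notification) is a separate decision.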