cachingweb-applicationsoptimizationmemcachedazure-caching

What should be stored in cache for web app?


I realize that this might be a vague question the bequests a vague answer, but I'm in need of some real world examples, thoughts, &/or best practices for caching data for a web app. All of the examples I've read are more technical in nature (how to add or remove cache data from the respective cache store), but I've not been able to find a higher level strategy for caching.

For example, my web app has an inbox/mail feature for each user. What I've been doing to date is storing typical session data in the cache. In this example, when the user logs in I go to the database and retrieve the user's mail messages and store them in cache. I'm beginning to wonder if I should just maintain a copy of all users' messages in the cache, all the time, and just retrieve them from cache when needed, instead of loading from the database upon login. I have a bunch of other data that's loaded on login (product catalogs and related entities) and login is starting to slow down.

So I guess my question to the community, is what would you do/recommend as an approach in this scenario?

Thanks.


Solution

  • This might be better suited to https://softwareengineering.stackexchange.com/, but generally you want to cache:

    The last item above is where you need to be careful as you can drastically increase your app's memory usage, by adding a few megabytes to the data for every active session. It also implies different levels of caching -- application wide, user session, etc.

    Generally you should NOT cache data that is under active change.

    In larger systems you also need to think about where the cache(s) will sit. Is it possible to have one central cache server, or is it good enough for each server/process to handle its own caching?

    Also: you should have some method to quickly reset/invalidate the cached data. For a smaller or less mission-critical app, this could be as simple as restarting the web server. For the large system that I work on, we use a 12 hour absolute expiration window for most cached data, but we have a way of forcing immediate expiration if we need it.

    This is a really broad question, and the answer depends heavily on the specific application/system you are building. I don't know enough about your specific scenario to say if you should cache all the users' messages, but instinctively it seems like a bad idea since you would seem to be effectively caching your entire data set. This could lead to problems if new messages come in or get deleted. Would you then update them in the cache? Would that not simply duplicate the backing store?

    Caching is only a performance optimization technique, and as with any optimization, measure first before making substantial changes, to avoid wasting time optimizing the wrong thing. Maybe you don't need much caching, and it would only complicate your app. Maybe the data you are thinking of caching can be retrieved in a faster way, or less of it can be retrieved at once.