Currently we have a service that creates XML pages based on various GET parameters. As the number of parameters has grown, the number of different combinations has grown too, and the hit rate in our Varnish cache has fallen. We've increased the TTL, and the hit rate has gone up accordingly, but I'm toying with the following thought:
I just came across Edge Side Includes and am wondering: if I produce pages of XML containing 50 elements each time, could I instead generate a page with 50 ESI includes, which Varnish would then combine into one document?
Why 50 ESI elements, you ask? Because each individual XML element is easily cached under a single URL, whereas the combination of filters causes a multitude of different complete XML documents to be generated.
So even if one request filters out the first 10 XML elements (because they don't conform to the GET parameters), each remaining element will still be fetched from the cache, since the ESI fragments are cached individually.
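Concretely, something like the sketch below is what I have in mind (the /elements.xml and /element/N.xml paths are made up, and the VCL hook is vcl_backend_response on Varnish 4+ or vcl_fetch on Varnish 3):

    <!-- parent document returned by the backend: just ESI placeholders -->
    <?xml version="1.0"?>
    <elements>
      <esi:include src="/element/1.xml"/>
      <esi:include src="/element/2.xml"/>
      <!-- ... up to /element/50.xml ... -->
    </elements>

    # VCL: tell Varnish to run ESI processing on the parent document
    # (vcl_backend_response on Varnish 4+, vcl_fetch on Varnish 3)
    sub vcl_backend_response {
        if (bereq.url == "/elements.xml") {
            set beresp.do_esi = true;
        }
        if (bereq.url ~ "^/element/") {
            set beresp.ttl = 4h;   # each fragment is cached under its own URL
        }
    }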
How heavy would this be on the server? Does it make sense to do this? Or is ESI so expensive that it wouldn't make sense?
Update
First off, we have never run out of memory and the nuke counter is zero. We currently have a hit/miss ratio of 0.4 with a TTL of 4 hours, which is terrible in my opinion, due to all of these combinations (countries, locales, etc.). Worse still, Tomcat has gone to 100% utilization and hung while Varnish stays at a steady 1-3%. My gut feeling says that having Varnish stitch the ESI fragments together, and remember the subdocuments, would protect Tomcat even more and increase our capacity. Oddly, we've never had to nuke items, which means the ~1 GB cache never fills before cache entries expire. I'm sure that if we cache each subdocument we may reach the memory limit and start nuking items... but doesn't Varnish use some kind of least-recently-used algorithm?
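For reference, this is roughly how I'm reading those numbers; a minimal check, assuming the counter names from our Varnish version (they carry a MAIN. prefix on Varnish 4+ and are unprefixed on older versions):

    # n_lru_nuked stays at 0 as long as entries expire before the cache fills;
    # comparing cache_hit vs cache_miss gives the hit/miss ratio over time
    varnishstat -1 | egrep 'n_lru_nuked|cache_hit|cache_miss'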
It's generally not the best decision to blanket-cache collections for which there are tons of different combinations of query parameters. Chances are, certain combinations are accessed much more often than others (for instance combinations with lots of SEO juice, combinations you distribute/share or link to on your site, or combinations that are simply more relevant to your users), so those should be cached selectively. The problem with just caching everything for a long TTL when the URL space is big is that you may run out of memory and nuke resources that are frequently accessed in favor of caching things that are accessed infrequently.
There is no limit on the number of ESI includes per page, and the approach you describe is a good strategy provided the hit rate on the XML subdocuments is very high. Cache hits in Varnish are very lightweight, so even if a page is a composite of 50 cache hits I think it will perform quite well compared to no caching. If the hit rate on the ESI-included subdocuments is low, and there are tons of them on each page, it will result in worse performance than just having the backend render the subdocuments each time. I would definitely recommend doing some load testing on the following scenarios so that you can make an educated decision:

- ESI-composed pages with a 100% hit rate on the subdocuments
- ESI-composed pages with a 50% hit rate on the subdocuments
- ESI-composed pages with a 0% hit rate on the subdocuments
- no ESI/caching at all, with the backend rendering each complete document
This will give a nice picture of how performance degrades as your hit rate goes down (it may not be linear, hence testing 0%, 50%, and 100%) and also tell you how much caching can improve performance in theory. To me it seems likely that the best solution is some combination of esi:including fragments in a "working set" of regularly accessed subdocuments and rendering the subdocuments directly on the backend if they are not in the working set.
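A rough sketch of what that hybrid could look like on the Tomcat side, assuming a hypothetical workingSet lookup and renderInline helper (neither is part of your code, they just illustrate the split): emit an esi:include for fragments that are regularly accessed, and render everything else inline so it never occupies cache space.

    // Hypothetical sketch: decide per element whether to delegate to Varnish
    // via ESI (hot, regularly accessed fragments) or render inline (cold ones).
    import java.io.IOException;
    import java.io.Writer;
    import java.util.Set;

    public class ElementWriter {
        private final Set<String> workingSet;   // ids of frequently accessed elements

        public ElementWriter(Set<String> workingSet) {
            this.workingSet = workingSet;
        }

        public void writeElement(Writer out, String elementId) throws IOException {
            if (workingSet.contains(elementId)) {
                // Varnish fetches and caches this fragment under its own URL
                out.write("<esi:include src=\"/element/" + elementId + ".xml\"/>");
            } else {
                // Rarely requested: render directly, don't take up cache space
                out.write(renderInline(elementId));
            }
        }

        private String renderInline(String elementId) {
            // placeholder for the existing rendering code
            return "<element id=\"" + elementId + "\"/>";
        }
    }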