Tags: apache, varnish, varnish-vcl

Understanding why Varnish isn't delivering the cached version


I'm new to Varnish and have recently set it up on my server, which runs MediaWiki, but I'm a little confused when trying to debug why it isn't delivering pages from the cache. I'm also using Ezoic Ads, which is notorious for setting cookies and the like.

I'm 90% sure that Ezoic is the reason pages aren't being delivered from the cache: when I disable it, I am (or at least I believe I am) served a cached version of a page from the server. I'm just not sure what Ezoic is doing that stops cached versions from being served.

I'm using this as my default.vcl, which is the same one used by Wikipedia and other MediaWiki sites that use Varnish. From what I can gather, these settings tell Varnish to ignore all cookies unless they're session or token cookies, so I suspect the cookies aren't the issue, but I could be wrong.

I've run varnishlog -g request -q "(VCL_call eq 'MISS' or VCL_call eq 'PASS') and ReqUrl ~ '^/wiki/'" to see a log of why some pages are being passed back to the web server instead of being served by Varnish, and this pastebin is an example of one of the requests.

Which part of the output is the reason that the request was passed back to the webserver? I can't seem to find any documentation that explains exactly where in the log to look for the reasoning as to why the request is being passed.


Solution

  • Unfortunately you only shared the BeReq backend log transaction. The actual decision-making is done in a Req client request log transaction.

    It would be helpful to add that transaction to https://pastebin.com/4VAJ8cex. However, the BeReq transaction already indicates some cookie-related issues.

    The cookie header that is sent to the backend still contains a lot of tracking cookies, as you can see below:

    --  BereqHeader    Cookie: _ga=GA1.1.2033892632.1674510549; __gads=ID=6956ea7a576443c1-226b89437bdb000b:T=1674510549:S=ALNI_MbFWR631GOAzUPplF2CQE_vU79FlA; ezosuibasgeneris-1=ca67b864-64e7-47a2-622a-58234d258f12; ezCMPCCS=false; _pk_id.52.ebfe=890ac89b1e44b8b0.1674510619.;
    

    This is the VCL code you use to handle cookies:

    if (req.http.Authorization || req.http.Cookie ~ "([sS]ession|Token)=") {
        return (pass);
    }
    

    So while this VCL code does not force a pass for these requests, the unsanitized tracking cookies still affect the way the objects are stored in the cache.

    Cache variations

    And this is all related to the following header that the application is returning:

    Vary: Accept-Encoding,DNT,Cookie
    

    This Vary header creates a cache variation for each value of the request headers it mentions.

    Because the response varies on Cookie, each page ends up with a lot of versions: the tracking cookies often get different values, and every distinct Cookie header gets its own cache variation.

    The solution

    My advice would be to modify the Vary header and remove the Cookie value from it, and maybe also the DNT value. As a matter of fact, you can remove the Vary header completely and rely on Varnish to send the proper Vary: Accept-Encoding.

    If you don't know how to configure this in your application or web server, you can also strip it in VCL:

    sub vcl_backend_response {
        # Drop the whole Vary header whenever the backend varies on Cookie.
        if (beresp.http.Vary ~ "Cookie") {
            unset beresp.http.Vary;
        }
    }
    

    This vcl_backend_response subroutine can be added before the one in your current VCL template.
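
    If you would rather modify the Vary header than drop it, the following is a minimal sketch that removes only the Cookie and DNT values and leaves the rest (for example Accept-Encoding) untouched. The regular expressions are my own assumption and have not been tested against your setup, so treat this as a starting point rather than a drop-in fix:

    sub vcl_backend_response {
        if (beresp.http.Vary) {
            # Remove the Cookie and DNT entries, wherever they sit in the list.
            set beresp.http.Vary = regsuball(beresp.http.Vary, "(?i)(^|,)\s*(cookie|dnt)\s*(?=,|$)", "");
            # Clean up a leading comma left behind when the first entry was removed.
            set beresp.http.Vary = regsub(beresp.http.Vary, "^[,\s]+", "");
            # Nothing left to vary on: drop the header entirely.
            if (beresp.http.Vary == "") {
                unset beresp.http.Vary;
            }
        }
    }

    With the header from your log, Vary: Accept-Encoding,DNT,Cookie, this should leave Vary: Accept-Encoding.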

    An alternative solution

    If removing the Vary header is hard or causes unwanted side effects, you can also sanitize the cookies and strip off the tracking cookies.

    See https://www.varnish-software.com/developers/tutorials/removing-cookies-varnish for an official tutorial on how to remove cookies.

    However, in your case, this would be the VCL code:

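    sub vcl_recv {
        if (req.http.Cookie) {
            # Prefix a ";" so that every cookie, including the first, starts with a delimiter.
            set req.http.Cookie = ";" + req.http.Cookie;
            # Normalize "; " separators to a bare ";".
            set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
            # Mark the cookies to keep by putting a space back in front of them.
            set req.http.Cookie = regsuball(req.http.Cookie, ";([sS]ession|Token)=", "; \1=");
            # Remove every cookie that is not marked with a leading space.
            set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
            # Trim leftover separators at the start and end.
            set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    
            # If no cookies are left, remove the header so it no longer affects caching.
            if (req.http.Cookie ~ "^\s*$") {
                unset req.http.Cookie;
            }
        }
    }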
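
    To check whether a page is actually served from the cache after either change, one common approach is to expose the hit count in a response header. This is a sketch only: X-Cache is a header name I picked for illustration and is not part of your current VCL, and it assumes Varnish 4 or later, where obj.hits is available in vcl_deliver:

    sub vcl_deliver {
        # obj.hits counts how often this cached object has been delivered before.
        if (obj.hits > 0) {
            set resp.http.X-Cache = "HIT";
        } else {
            set resp.http.X-Cache = "MISS";
        }
    }

    Requesting the same /wiki/ page twice should then show X-Cache: MISS on the first response and X-Cache: HIT on the second, provided the request was not passed.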