varnishvarnish-4

Varnish (?) interrupting file download


On a debian buster server I have an apache2 web server with varnish as a cache in front of it, actually varnish listens on port 80 and fetches from apache which listens on port 8080.

Maybe since upgrade from stretch, file downloads aren't completed any more.

Downloading with curl:

$ wget -v -t 1 https://myserver.org/myfile.zip
--2020-07-10 19:52:34--  https://myserver.org/myfile.zip
Resolving myserver.org (myserver.org)... w.x.y.z
Connecting to myserver.org (myserver.org)|w.x.y.z|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 333774113 (318M) [application/zip]
Saving to: 'myfile.zip.13'

myfile.zip.13                                                         1%[=>    ]   4.43M  8.29MB/s    in 0.5s    

2020-07-10 19:52:35 (8.29 MB/s) - Connection closed at byte 4649063. Giving up.

The transmitted amount varies at every request.

Here is the varnishlog for this request:

*   << Request  >> 2104598   
-   Begin          req 2104597 rxreq
-   Timestamp      Start: 1594404000.669087 0.000000 0.000000
-   Timestamp      Req: 1594404000.669087 0.000000 0.000000
-   ReqStart       127.0.0.1 40966 a0
-   ReqMethod      GET
-   ReqURL         /myfile.zip
-   ReqProtocol    HTTP/1.1
-   ReqHeader      User-Agent: Wget/1.19.4 (linux-gnu)
-   ReqHeader      Accept: */*
-   ReqHeader      Accept-Encoding: identity
-   ReqHeader      Host: myserver.org
-   ReqHeader      X-SSL: 1
-   ReqHeader      X-Forwarded-Proto: https
-   ReqHeader      X-Forwarded-For: a.b.c.d
-   ReqHeader      Connection: close
-   ReqUnset       X-Forwarded-For: a.b.c.d
-   ReqHeader      X-Forwarded-For: a.b.c.d, 127.0.0.1
-   VCL_call       RECV
-   ReqUnset       Accept-Encoding: identity
-   VCL_return     hash
-   VCL_call       HASH
-   VCL_return     lookup
-   VCL_call       MISS
-   VCL_return     fetch
-   Link           bereq 2104599 fetch
-   Timestamp      Fetch: 1594404000.669596 0.000509 0.000509
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Date: Fri, 10 Jul 2020 18:00:00 GMT
-   RespHeader     Last-Modified: Sun, 28 Jun 2020 14:46:46 GMT
-   RespHeader     ETag: "13e4fd21-5a92602e5bcc1"
-   RespHeader     Content-Length: 333774113
-   RespHeader     Content-Type: application/zip
-   RespHeader     X-Varnish: 2104598
-   RespHeader     Age: 0
-   RespHeader     Via: 1.1 varnish (Varnish/6.1)
-   VCL_call       DELIVER
-   RespHeader     X-Cache: uncached
-   RespUnset      Via: 1.1 varnish (Varnish/6.1)
-   RespUnset      X-Varnish: 2104598
-   VCL_return     deliver
-   Timestamp      Process: 1594404000.669609 0.000521 0.000012
-   RespHeader     Accept-Ranges: bytes
-   RespHeader     Connection: close
-   Timestamp      Resp: 1594404000.884468 0.215381 0.214859
-   ReqAcct        237 0 237 260 4649063 4649323
-   End

Here is the backend interaction (other wget run, closed at byte 6291456):

*   << BeReq    >> 2612416   
-   Begin          bereq 2612415 fetch
-   Timestamp      Start: 1594674394.093379 0.000000 0.000000
-   BereqMethod    GET
-   BereqURL       /myfile.zip
-   BereqProtocol  HTTP/1.1
-   BereqHeader    User-Agent: Wget/1.19.4 (linux-gnu)
-   BereqHeader    Accept: */*
-   BereqHeader    Host: myserver.org
-   BereqHeader    X-SSL: 1
-   BereqHeader    X-Forwarded-Proto: https
-   BereqHeader    X-Forwarded-For: a.b.c.d, 127.0.0.1
-   BereqHeader    Accept-Encoding: gzip
-   BereqHeader    X-Varnish: 2612416
-   VCL_call       BACKEND_FETCH
-   VCL_return     fetch
-   BackendOpen    28 default 127.0.0.1 8080 127.0.0.1 50218
-   BackendStart   127.0.0.1 8080
-   Timestamp      Bereq: 1594674394.093413 0.000034 0.000034
-   Timestamp      Beresp: 1594674394.093710 0.000330 0.000297
-   BerespProtocol HTTP/1.1
-   BerespStatus   200
-   BerespReason   OK
-   BerespHeader   Date: Mon, 13 Jul 2020 21:06:34 GMT
-   BerespHeader   Server: Apache
-   BerespHeader   Last-Modified: Sun, 28 Jun 2020 14:46:46 GMT
-   BerespHeader   ETag: "13e4fd21-5a92602e5bcc1"
-   BerespHeader   Accept-Ranges: bytes
-   BerespHeader   Content-Length: 333774113
-   BerespHeader   Content-Type: application/zip
-   TTL            RFC 120 10 0 1594674394 1594674394 1594674394 0 0 cacheable
-   VCL_call       BACKEND_RESPONSE
-   BerespUnset    Server: Apache
-   TTL            VCL 1800 10 0 1594674394 cacheable
-   TTL            VCL 1800 30 0 1594674394 cacheable
-   VCL_return     deliver
-   Filters        
-   Storage        malloc s0
-   Fetch_Body     3 length stream
-   ExpKill        LRU_Cand p=0x7f97b4a358c0 f=0x0 r=1
-   ExpKill        LRU x=2631322
-   ExpKill        LRU_Cand p=0x7f97efa67980 f=0x0 r=1
-   ExpKill        LRU x=2284258
-   ExpKill        LRU_Cand p=0x7f97b4a6b1c0 f=0x0 r=1
-   ExpKill        LRU x=2631334
-   ExpKill        LRU_Cand p=0x7f97efa67a40 f=0x0 r=1
-   ExpKill        LRU x=2284266
-   ExpKill        LRU_Cand p=0x7f97efa67b00 f=0x0 r=1
-   ExpKill        LRU x=2284269
-   ExpKill        LRU_Cand p=0x7f97d484c780 f=0x0 r=1
-   ExpKill        LRU x=1667855
-   ExpKill        LRU_Cand p=0x7f97d9af1bc0 f=0x0 r=1
-   ExpKill        LRU x=2950859
-   ExpKill        LRU_Cand p=0x7f97e7886040 f=0x0 r=1
-   ExpKill        LRU x=2472240
-   ExpKill        LRU_Cand p=0x7f97d9af1c80 f=0x0 r=1
-   ExpKill        LRU x=2950862
-   ExpKill        LRU_Cand p=0x7f97d484e640 f=0x0 r=1
-   ExpKill        LRU x=1667862
-   ExpKill        LRU_Cand p=0x7f97d484eac0 f=0x0 r=1
-   ExpKill        LRU x=1667865
-   ExpKill        LRU_Cand p=0x7f97d9af1e00 f=0x0 r=1
-   ExpKill        LRU x=2950865
-   ExpKill        LRU_Cand p=0x7f97d48d2180 f=0x0 r=1
-   ExpKill        LRU x=1667871
-   ExpKill        LRU_Cand p=0x7f97d9af2040 f=0x0 r=1
-   ExpKill        LRU x=2950871
-   ExpKill        LRU_Cand p=0x7f97d9af2100 f=0x0 r=1
-   ExpKill        LRU x=2950876
-   ExpKill        LRU_Cand p=0x7f97b4b11000 f=0x0 r=1
-   ExpKill        LRU x=2631343
-   ExpKill        LRU_Cand p=0x7f97efa67bc0 f=0x0 r=1
-   ExpKill        LRU x=2284275
-   ExpKill        LRU_Cand p=0x7f97efa67c80 f=0x0 r=1
-   ExpKill        LRU x=2284278
-   ExpKill        LRU_Cand p=0x7f97d48d3b00 f=0x0 r=1
-   ExpKill        LRU x=1667880
-   ExpKill        LRU_Cand p=0x7f97e7886100 f=0x0 r=1
-   ExpKill        LRU x=2472249
-   ExpKill        LRU_Cand p=0x7f97d9a3d3c0 f=0x0 r=1
-   ExpKill        LRU x=2950879
-   ExpKill        LRU_Cand p=0x7f97d9a3d900 f=0x0 r=1
-   ExpKill        LRU x=2950886
-   ExpKill        LRU_Cand p=0x7f97d48d4040 f=0x0 r=1
-   ExpKill        LRU x=1667895
-   ExpKill        LRU_Cand p=0x7f97e79d1e80 f=0x0 r=1
-   ExpKill        LRU x=2631367
-   ExpKill        LRU_Cand p=0x7f97d6bf6780 f=0x0 r=1
-   ExpKill        LRU x=1205206
-   ExpKill        LRU_Cand p=0x7f97cbb3c2c0 f=0x0 r=1
-   ExpKill        LRU x=2325555
-   ExpKill        LRU_Cand p=0x7f97d6bf6840 f=0x0 r=1
-   ExpKill        LRU x=1205217
-   ExpKill        LRU_Cand p=0x7f97e7886340 f=0x0 r=1
-   ExpKill        LRU x=2472273
-   ExpKill        LRU_Cand p=0x7f97cbb3c380 f=0x0 r=1
-   ExpKill        LRU x=2325569
-   ExpKill        LRU_Cand p=0x7f97cbb3c440 f=0x0 r=1
-   ExpKill        LRU x=2325572
-   ExpKill        LRU_Cand p=0x7f97d6bf69c0 f=0x0 r=1
-   ExpKill        LRU x=1205230
-   ExpKill        LRU_Cand p=0x7f97e7886400 f=0x0 r=1
-   ExpKill        LRU x=2472276
-   ExpKill        LRU_Cand p=0x7f97d6bf6b40 f=0x0 r=1
-   ExpKill        LRU x=1205239
-   ExpKill        LRU_Cand p=0x7f97b4b11480 f=0x0 r=1
-   ExpKill        LRU x=2631405
-   ExpKill        LRU_Cand p=0x7f97b4b11540 f=0x0 r=1
-   ExpKill        LRU x=2631410
-   ExpKill        LRU_Cand p=0x7f97e7886640 f=0x0 r=1
-   ExpKill        LRU x=2472294
-   ExpKill        LRU_Cand p=0x7f97b4b116c0 f=0x0 r=1
-   ExpKill        LRU x=2631416
-   ExpKill        LRU_Cand p=0x7f97e7886700 f=0x0 r=1
-   ExpKill        LRU x=2472299
-   ExpKill        LRU_Cand p=0x7f97b4b11780 f=0x0 r=1
-   ExpKill        LRU x=2631419
-   ExpKill        LRU_Cand p=0x7f97e78867c0 f=0x0 r=1
-   ExpKill        LRU x=2472302
-   ExpKill        LRU_Cand p=0x7f97e7886880 f=0x0 r=1
-   ExpKill        LRU x=2472305
-   ExpKill        LRU_Cand p=0x7f97b4b11b40 f=0x0 r=1
-   ExpKill        LRU x=2631428
-   ExpKill        LRU_Cand p=0x7f97e7886940 f=0x0 r=1
-   ExpKill        LRU x=2472308
-   ExpKill        LRU_Cand p=0x7f97e7886ac0 f=0x0 r=1
-   ExpKill        LRU x=2472314
-   ExpKill        LRU_Cand p=0x7f97b4b12080 f=0x0 r=1
-   ExpKill        LRU x=2631434
-   ExpKill        LRU_Cand p=0x7f97d6bf6c00 f=0x0 r=1
-   ExpKill        LRU x=1205267
-   ExpKill        LRU_Cand p=0x7f97e7886b80 f=0x0 r=1
-   ExpKill        LRU x=2472348
-   ExpKill        LRU_Cand p=0x7f97d6bf6cc0 f=0x0 r=1
-   ExpKill        LRU x=1205274
-   ExpKill        LRU_Cand p=0x7f97b4b12200 f=0x0 r=1
-   ExpKill        LRU x=2631441
-   ExpKill        LRU_Cand p=0x7f97e7886c40 f=0x0 r=1
-   ExpKill        LRU x=2472360
-   ExpKill        LRU reached nuke_limit
-   FetchError     Could not get storage
-   BackendClose   28 default
-   BereqAcct      244 0 244 230 10764288 10764518
-   End            

I can't undestand what's happening...

Here is my varnish's default.vcl:

# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

# access control list for "purge": open to only localhost and other local nodes
acl purge {
    "127.0.0.1";
    "localhost";
}

# vcl_recv is called whenever a request is received 
sub vcl_recv {
        # Serve objects up to 2 minutes past their expiry if the backend
        # is slow to respond.
        #set req.grace = 120s;
        if (req.http.X-Forwarded-Proto !~ "https")
        {
          unset req.http.X-Forwarded-For;
          set req.http.X-Forwarded-For = client.ip;
        }
        set req.backend_hint= default;

        # This uses the ACL action called "purge". Basically if a request to
        # PURGE the cache comes from anywhere other than localhost, ignore it.
        if (req.method == "PURGE") 
            {
                if (!client.ip ~ purge)
                    {return(synth(405,"Not allowed."));}
                #return(hash);}
                return(purge);
            }

        # Pass any requests that Varnish does not understand straight to the backend.
        if (req.method != "GET" && req.method != "HEAD" &&
            req.method != "PUT" && req.method != "POST" &&
            req.method != "TRACE" && req.method != "OPTIONS" &&
            req.method != "DELETE") 
            {return(pipe);}     /* Non-RFC2616 or CONNECT which is weird. */

        # Pass anything other than GET and HEAD directly.
        if (req.method != "GET" && req.method != "HEAD")
           {return(pass);}      /* We only deal with GET and HEAD by default */

        # Pass requests from logged-in users directly.
        #if (req.http.Authorization || req.http.Cookie)
        #   {return(pass);}      /* Not cacheable by default */

        # Pass any requests with the "If-None-Match" header directly.
        if (req.http.If-None-Match)
           {return(pass);}

        # Force lookup if the request is a no-cache request from the client.
        # DISATTIVO PERCHÉ ban RICHIEDE UNA COMPARAZIONE
        #if (req.http.Cache-Control ~ "no-cache")
        #   {ban(req.url);}

        # normalize Accept-Encoding to reduce vary
        if (req.http.Accept-Encoding) {
          if (req.http.User-Agent ~ "MSIE 6") {
            unset req.http.Accept-Encoding;
          } elsif (req.http.Accept-Encoding ~ "gzip") {
            set req.http.Accept-Encoding = "gzip";
          } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
          } else {
            unset req.http.Accept-Encoding;
          }
        }


        # remove the cookies form static files in order to serve them from cache
        if (req.url ~ "(?i)\.(jpeg|jpg|png|gif|ico|webp|js|css|txt|pdf|gz|zip|lzma|bz2|tgz|tbz|html|htm)$") {
            unset req.http.cookie;
        }

        # Do not cache HTTP authentication and HTTP Cookie
        if (req.http.Authorization || req.http.Cookie) {
            # Not cacheable by default
            return (pass);
        }

        return(hash);
        #return(pass);
}

sub vcl_pipe {
        # Note that only the first request to the backend will have
        # X-Forwarded-For set.  If you use X-Forwarded-For and want to
        # have it set for all requests, make sure to have:
        # set req.http.connection = "close";
 
        # This is otherwise not necessary if you do not do any request rewriting.
 
        set req.http.connection = "close";
}

# Called if the cache has a copy of the page.
sub vcl_hit {
        if (req.method == "PURGE") 
            {ban(req.url);
            return(synth(200,"Purged"));}
 
        if (!obj.ttl > 0s)
           {return(pass);}
}
 
# Called if the cache does not have a copy of the page.
sub vcl_miss {
        if (req.method == "PURGE") 
           {return(synth(200,"Not in cache"));}
}

# Called after a document has been successfully retrieved from the backend.
sub vcl_backend_response {
 
    # Remove some headers we never want to see
    unset beresp.http.Server;
    unset beresp.http.X-Powered-By;

    # For static content strip all backend cookies
    if (bereq.url ~ "\.(css|js|png|gif|jp(e?)g)|swf|ico") {
        unset beresp.http.cookie;
    }
    # Don't store backend
    if (bereq.url ~ "wp-(login|admin)" || bereq.url ~ "preview=true") {
        set beresp.uncacheable = true;
        set beresp.ttl = 30s;
        return (deliver);
    }


    # don't cache response to posted requests or those with basic auth
    if ( bereq.method == "POST" || bereq.http.Authorization ) {
            set beresp.uncacheable = true;
        set beresp.ttl = 120s;
        return (deliver);
        }

        # don't cache search results
    if ( bereq.url ~ "\?s=" ){
        set beresp.uncacheable = true;
                set beresp.ttl = 120s;
                return (deliver);
    }

    # only cache status ok
    if ( beresp.status != 200 ) {
        set beresp.uncacheable = true;
                set beresp.ttl = 120s;
                return (deliver);
    }

    # A TTL of 2h
    set beresp.ttl = 30m;
    # Define the default grace period to serve cached content
    set beresp.grace = 30s;
        return (deliver);
}

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "cached";
    } else {
        set resp.http.x-Cache = "uncached";
    }

    # Remove some headers: PHP version
    unset resp.http.X-Powered-By;

    # Remove some headers: Apache version & OS
    unset resp.http.Server;

    # Remove some heanders: Varnish
    unset resp.http.Via;
    unset resp.http.X-Varnish;
    
    return(deliver);
}

#sub vcl_fetch {
#   if (req.url ~ "^/piwik") {
#       return(hit_for_pass);
#   }
#}

Solution

  • Based on the backend-side varnishlog information, I can conclude you are running out of storage.

    These are the log items that confirm this:

    -   ExpKill        LRU reached nuke_limit
    -   FetchError     Could not get storage
    

    When the cache is full and Varnish wants to store another object in cache, the LRU (Least Recently Used) mechanism kicks in.

    This means it starts looking for the least recently used items in cache, and evicts them from cache. Varnish does this a number of times until it hits the nuke_limit which is set to 50 by default.

    In this case, Varnish will try to remove the 50 least recently used objects in order to fit the new object in cache.

    My assumption is that the 50 least recently used objects are pretty small in terms of size, and the space that is freed up doesn't match the size requirements of the new object.

    These are some of your options: