httpcurlcookies

Curl cookie authentication


Trying to make my life easier by downloading files from archive.org which have been predefined in a list in a batch. If the file does not require authentication, then it works. However I am having problems downloading files that require to be logged in. Archive.org uses Cookie based auth, so I though the below would work:

curl --dump-header cookie.txt -u user:pass -H "Connection: keep-alive" https://archive.org/account/login
curl -v -b cookie.txt --location-trusted -O https://archive.org/download/mame-merged/mame-merged/3countb.zip

When I use --verbose I can see the cookies are there in the redirect CDN request, however the request is always unauthorised:

> Cookie: donation-identifier=3696401840959dfcc5e3b3f02cd51945; abtest-identifier=d6bb7a8a590bb5640d31ce8d27a112ce; test-cookie=1; ia-auth=fbda0754bc3108de49a9e811460d8e59
>
* Request completely sent off
0     0    0     0    0     0      0      0 --:--:--  0:00:03 --:--:--     0< HTTP/1.1 401 Unauthorized
< Server: nginx
< Date: Sun, 26 Jan 2025 15:56:03 GMT
< Content-Length: 172
< Connection: keep-alive

What am I missing?


Solution

  • Run curl to download cookies first, I think it has something to do with the CSRF token and then log in with POST:

    $ curl -s -c cookie.txt https://archive.org/account/login -o /dev/null
    $ curl -s -b cookie.txt -c cookie.txt -d 'username=<your@email.com>&password=<your-password>' https://archive.org/account/login | jq
    {
      "status": "ok",
      "message": "Successful login.",
      "referer": "https://archive.org/"
    }
    

    Now curl -v -b cookie.txt --location-trusted -O https://archive.org/download/mame-merged/mame-merged/3countb.zip will download a zip file:

    $ file 3countb.zip
    3countb.zip: Zip archive data, at least v2.0 to extract, compression method=deflate