Tags: linux, shell, unix, curl

Parallel downloads using the curl command-line utility


I want to download some pages from a website, and I did it successfully using curl. But I was wondering whether curl can download multiple pages at a time, just like most download managers do; that would speed things up a bit. Is it possible to do this with the curl command-line utility?

The command I am currently using is:

curl 'http://www...../?page=[1-10]' 2>&1 > 1.html

Here I am downloading pages 1 through 10 and storing them all in a single file named 1.html.

Also, is it possible for curl to write the output of each URL to a separate file, say URL.html, where URL is the actual URL of the page being processed?


Solution

  • Well, curl is just an ordinary UNIX process. You can run as many of these curl processes in parallel as you like, each sending its output to a different file.

    curl can use the filename part of the URL to generate the local file. Just use the -O option (see man curl for details).
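
    For instance (hypothetical URL; note that -O needs a filename component in the URL path, which is why the examples below use path-style URLs rather than the question's ?page=N form):

    # saves the response as page1.html in the current directory
    curl -O http://example.com/page1.html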

    You could use something like the following:

    urls="http://example.com/?page1.html http://example.com?page2.html" # add more URLs here
    
    for url in $urls; do
       # run the curl job in the background so we can start another job
       # and disable the progress bar (-s)
       echo "fetching $url"
       curl $url -O -s &
    done
    wait #wait for all background jobs to terminate
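
    To answer the second question, curl's URL globbing can also name each output file after the glob value: with -o, the #1 placeholder expands to the current value of the first [ ] range. This also sidesteps the -O limitation noted above, since ?page=N URLs have no filename part. A sketch, assuming the same placeholder URL pattern as the question (example.com stands in for the real site):

    # one file per page: page_1.html ... page_10.html
    curl -s 'http://example.com/?page=[1-10]' -o 'page_#1.html'
    
    # recent curl (7.66.0 or later) can even run the transfers in parallel itself
    curl -s -Z 'http://example.com/?page=[1-10]' -o 'page_#1.html'

    If you need to cap how many downloads run at once (the loop above starts one background job per URL, all at the same time), GNU or BSD xargs with -P is a common alternative, sketched here with the same assumed URL:

    # at most 4 curl processes at a time
    seq 1 10 | xargs -P 4 -I{} curl -s -o 'page_{}.html' 'http://example.com/?page={}'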