
Downloading in parallel from an FTP site using wget or aria2 on Windows


How can I download all files (in parallel) using wget2 or aria2 from here: ftp://ftp.soilgrids.org/data/recent/

I tried aria2c -j 8 ftp://ftp.soilgrids.org/data/recent/ but it does nothing and does not show any error message either.

I am on Windows.


Solution

  • wget is not multi-threaded, so you would have to split the URLs into batches yourself and invoke the program several times. aria2, on the other hand, cannot download recursively. Since you're on Windows, I won't assume anything is available beyond cmd and the wget and aria2 you already have.

    We can download the directory listing with wget and build a text file of URLs from it for aria2 to download in parallel. A small batch file does the conversion:

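    @ECHO OFF
    SETLOCAL EnableDelayedExpansion
    SET "host=ftp://ftp.soilgrids.org/data/recent"
    REM start from an empty URL list (a plain DEL complains if the file is missing)
    IF EXIST urls.txt DEL urls.txt
    
    REM fetch dirlisting from ftp; wget keeps it in the file .listing
    wget --no-remove-listing !host!/
    
    REM token 1 = mode flags, token 9 = file name (names with spaces are not handled)
    FOR /F "tokens=1,9" %%G IN (.listing) DO (
        SET "modeflags=%%G"
        REM skip directories: removing "d" from their mode flags changes the string
        IF "x!modeflags:d=!"=="x!modeflags!" (
            REM redirection comes first so no trailing space ends up after the URL
            >>urls.txt ECHO !host!/%%H
        )
    )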
    
    REM cleanup
    DEL .listing.*
    DEL index.html.*
    

    Then, you can just do...

    aria2c -j8 -i urls.txt
    

    ...to download the files in parallel.
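
If Python happens to be installed (the answer above deliberately does not assume it), the listing-to-URL step can also be sketched outside cmd. This is only an illustration under stated assumptions: the function name listing_to_urls, the example.invalid host, and the sample listing lines are all hypothetical, and the listing is assumed to be in the Unix "ls -l" style with nine fields, as the batch file also assumes. Unlike the batch version, this sketch keeps file names that contain spaces and percent-encodes them.

```python
from urllib.parse import quote

# Hypothetical sample in the Unix "ls -l" style that many FTP servers return.
SAMPLE = (
    "drwxr-xr-x    2 ftp      ftp          4096 Jan 01  2017 subdir\r\n"
    "-rw-r--r--    1 ftp      ftp       1048576 Jan 01  2017 example_a.tif\r\n"
    "-rw-r--r--    1 ftp      ftp          2048 Jan 01  2017 read me.txt\r\n"
)


def listing_to_urls(listing_text, base_url):
    """Turn a Unix-style FTP listing into a list of file URLs."""
    base = base_url.rstrip("/")
    urls = []
    for line in listing_text.splitlines():
        # Split into at most 9 fields so that a file name containing
        # spaces stays intact in the last field.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue  # blank line, "total N" header, or unexpected format
        mode, name = fields[0], fields[8].strip()
        # Keep only regular files (mode starts with "-"); this skips
        # directories ("d") and symlinks ("l").
        if not mode.startswith("-"):
            continue
        urls.append(base + "/" + quote(name))
    return urls


if __name__ == "__main__":
    for url in listing_to_urls(SAMPLE, "ftp://example.invalid/data/"):
        print(url)
```

To use it for real, read the .listing file that wget leaves behind instead of SAMPLE, write the returned URLs to urls.txt one per line, and run the aria2c command shown above.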