python httprequest netcdf opendap

How to download NASA satellite OPeNDAP data using Python


I have tried requests, pydap, urllib, and netCDF4, and I keep getting either redirect errors or permission errors when trying to download the following NASA data:

GLDAS_NOAH025SUBP_3H: GLDAS Noah Land Surface Model L4 3 Hourly 0.25 x 0.25 degree Subsetted V001 (http://disc.sci.gsfc.nasa.gov/uui/datasets/GLDAS_NOAH025SUBP_3H_V001/summary?keywords=Hydrology)

I am trying to download about 50k files. Here is an example URL for one of them, which works when pasted into the Google Chrome browser (provided you have a valid username and password):

http://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/OTF/HTTP_services.cgi?FILENAME=%2Fdata%2FGLDAS_V1%2FGLDAS_NOAH025SUBP_3H%2F2016%2F244%2FGLDAS_NOAH025SUBP_3H.A2016244.2100.001.2016256190725.grb&FORMAT=TmV0Q0RGLw&BBOX=-11.95%2C28.86%2C-0.62%2C40.81&LABEL=GLDAS_NOAH025SUBP_3H.A2016244.2100.001.2016286201048.pss.nc&SHORTNAME=GLDAS_NOAH025SUBP_3H&SERVICE=SUBSET_GRIB&VERSION=1.02&LAYERS=AAAB&DATASET_VERSION=001
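For reference, the query string of the example URL can be unpacked with the standard library, which makes it easier to see what each request asks for and to name the output files. This is only a sketch: the helper names (`parse_subset_url`, `bbox_values`, `decode_format`) are hypothetical, and reading the `FORMAT` value as unpadded base64 is an inference from this one URL, not something taken from NASA documentation.

```python
import base64
from urllib.parse import parse_qs, urlsplit

# The example URL from the question, split across lines for readability.
EXAMPLE_URL = (
    "http://hydro1.gesdisc.eosdis.nasa.gov/daac-bin/OTF/HTTP_services.cgi"
    "?FILENAME=%2Fdata%2FGLDAS_V1%2FGLDAS_NOAH025SUBP_3H%2F2016%2F244%2F"
    "GLDAS_NOAH025SUBP_3H.A2016244.2100.001.2016256190725.grb"
    "&FORMAT=TmV0Q0RGLw"
    "&BBOX=-11.95%2C28.86%2C-0.62%2C40.81"
    "&LABEL=GLDAS_NOAH025SUBP_3H.A2016244.2100.001.2016286201048.pss.nc"
    "&SHORTNAME=GLDAS_NOAH025SUBP_3H&SERVICE=SUBSET_GRIB&VERSION=1.02"
    "&LAYERS=AAAB&DATASET_VERSION=001"
)


def parse_subset_url(url):
    """Return the query parameters as a dict of percent-decoded strings."""
    query = parse_qs(urlsplit(url).query)
    # Each key appears once in this URL, so keep only the first value.
    return {key: values[0] for key, values in query.items()}


def bbox_values(params):
    """Split the BBOX parameter into a list of four floats."""
    return [float(part) for part in params["BBOX"].split(",")]


def decode_format(value):
    """Decode the FORMAT parameter, assuming it is base64 without padding."""
    padded = value + "=" * (-len(value) % 4)
    return base64.b64decode(padded).decode("ascii")


params = parse_subset_url(EXAMPLE_URL)
print(params["FILENAME"])               # server-side path of the source .grb file
print(params["LABEL"])                  # a ready-made name for the output file
print(bbox_values(params))              # [-11.95, 28.86, -0.62, 40.81]
print(decode_format(params["FORMAT"]))  # NetCDF/
```

Taking the output name from `LABEL` is less fragile than slicing the URL at fixed character offsets, since the offsets shift whenever a parameter changes length.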

Does anyone have experience getting NASA OPeNDAP data from the web using Python? I am happy to provide more information if needed.

Here is the requests attempt, which gives a 401 error:

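import requests

# pathlist is assumed to be defined elsewhere in the script;
# pathlist[0] is the working directory

def httpdownload():
    '''Loop through each line in the text file and download the URL.'''
    username = ""
    password = "*"
    with open(pathlist[0] + "NASAdownloadSample.txt", "r") as httpfile:
        for line in httpfile:
            url = line.strip()  # drop the trailing newline before requesting
            print(url)
            outname = line[-134:-122] + ".hdf"
            print(outname)
            # pass the variables, not the literal strings "username"/"password"
            r = requests.get(url, auth=(username, password), stream=True)
            print(r.text)
            print(r.status_code)
            with open(pathlist[0] + outname, 'wb') as out:
                out.write(r.content)
            print(outname, "finished")  # keep track of progress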

And here is the pydap attempt, which gives a redirect error:

import install_cas_client
from pydap.client import open_url

def httpdownload():
    '''Open the first URL in the text file with pydap.'''
    username = ""
    password = ""
    with open(pathlist[0] + "NASAdownloadSample.txt", "r") as httpfile:
        fileone = httpfile.readline().strip()
    # insert "username:password@" right after the "http://" prefix (7 characters)
    filetot = fileone[:7] + username + ":" + password + "@" + fileone[7:]
    print(filetot)
    dataset = open_url(filetot)

Solution

  • I did not find a solution using Python, although given what I know now it should be possible. Instead, I used wget with a .netrc file and a cookie file, following NASA's how-to (https://disc.gsfc.nasa.gov/information/howto?title=How%20to%20Download%20Data%20Files%20from%20HTTP%20Service%20with%20wget):

    #!/bin/bash

    # wget reads credentials from ~/.netrc, so create it and the cookie file in the home directory
    touch ~/.netrc
    echo "machine urs.earthdata.nasa.gov login <username> password <password>" >> ~/.netrc
    chmod 0600 ~/.netrc
    touch ~/.urs_cookies

    cd <path to output files>
    # the trailing backslashes keep this a single wget command
    wget --content-disposition --trust-server-names \
         --load-cookies ~/.urs_cookies --save-cookies ~/.urs_cookies \
         --auth-no-challenge=on --keep-session-cookies \
         -i <path to text file of url list>


    Hope it helps anyone else working with NASA data from this server.
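For anyone who wants to stay in Python, here is a rough sketch of the same idea using requests. It is untested against the NASA server, and the names `download_all` and `filename_from_response` are my own, not part of any library. It assumes the `~/.netrc` entry for `urs.earthdata.nasa.gov` from the script above already exists: when no `auth` argument is given, a `requests.Session` looks up credentials in `.netrc` for each host it contacts, including the host it is redirected to, and it keeps cookies for the lifetime of the session.

```python
import re
from pathlib import Path


def filename_from_response(response, fallback):
    """Use the server-suggested name (like wget --content-disposition), else fallback."""
    header = response.headers.get("Content-Disposition", "")
    match = re.search(r'filename="?([^";]+)"?', header)
    return match.group(1).strip() if match else fallback


def download_all(url_list_path, out_dir, session=None):
    """Download every URL listed in a text file (one per line) into out_dir."""
    if session is None:
        # Imported here so the helper can also be exercised with a stand-in session.
        import requests  # third-party: pip install requests
        session = requests.Session()  # no auth= given, so ~/.netrc is consulted

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    with open(url_list_path) as url_file:
        for index, line in enumerate(url_file):
            url = line.strip()
            if not url:
                continue  # skip blank lines
            with session.get(url, stream=True, timeout=120) as response:
                response.raise_for_status()  # stop on 401/403 instead of saving an error page
                name = filename_from_response(response, f"download_{index:05d}.nc")
                target = out_dir / Path(name).name  # drop any directory part
                with open(target, "wb") as out:
                    for chunk in response.iter_content(chunk_size=1 << 20):
                        out.write(chunk)
            saved.append(target)
    return saved
```

Calling `download_all("<path to text file of url list>", "<path to output files>")` would then play the role of the wget command. Streaming in chunks keeps memory use flat across the roughly 50k files, and reusing one session avoids logging in again for every request.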