python pandas socrata soda

How to download a .csv file from an API endpoint with pandas


I want to download a csv file from an API endpoint with pandas. I am using the following code:

df = pd.read_csv('https://data.cityofnewyork.us/resource/nu7n-tubp.csv')

However, the resulting dataframe has only 1,000 rows, even though the dataset is much larger (around 121k rows). How can I download all the rows?

I tried specifying a number larger than 1,000 with nrows, but I got the same result.


Solution

  • Socrata endpoints page their results and return 1,000 rows per request by default, which is why nrows has no effect: pandas only ever receives 1,000 rows. You can raise that cap with the $limit query parameter. According to the dataset page, this dataset has about 122k rows, so a limit of 130k retrieves them all:

    df=pd.read_csv('https://data.cityofnewyork.us/resource/nu7n-tubp.csv?$limit=130000')
    

    You may also want to explore the sodapy library, a Python client for the Socrata API.
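
If you would rather not guess a limit that is large enough, you can page through the data with $limit and $offset instead. The following is only a minimal sketch: fetch_all, page_size and read_page are names made up for this example, the page size of 50,000 is an assumption, and ordering by :id (Socrata's internal row identifier) is there to keep the row order stable between requests. It also assumes the endpoint returns a header-only CSV once the offset runs past the last row.

```python
from urllib.parse import urlencode

import pandas as pd


def fetch_all(base_url, page_size=50000, read_page=pd.read_csv):
    """Page through a Socrata CSV endpoint using $limit and $offset.

    fetch_all, page_size and read_page are illustrative names, not part
    of pandas or Socrata. read_page can be swapped out for testing.
    """
    pages = []
    offset = 0
    while True:
        # Keep "$" and ":" unescaped so the URL reads like the one above.
        query = urlencode(
            {"$limit": page_size, "$offset": offset, "$order": ":id"},
            safe="$:",
        )
        page = read_page(f"{base_url}?{query}")
        if page.empty:
            break  # nothing left to fetch
        pages.append(page)
        if len(page) < page_size:
            break  # a short page means this was the last one
        offset += page_size
    if not pages:
        return pd.DataFrame()
    return pd.concat(pages, ignore_index=True)
```

With the dataset from the question, this would be called as df = fetch_all('https://data.cityofnewyork.us/resource/nu7n-tubp.csv'). For a dataset of this size, the single request with a large $limit is simpler. Paging is mostly useful when you do not know the row count in advance.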