I'm trying to retrieve monthly average rainfall from 2004 to 2021 based on CHIRPS data by district, using a shapefile I imported from my drive. So far, I am using the following code in Google Colab:
path = "/content/drive/.../x.shp"
districts = gpd.read_file(path)
startDate = ee.Date('2004-01-01')
endDate = ee.Date('2021-12-31')
chirps = ee.ImageCollection('UCSB-CHG/CHIRPS/DAILY').filterDate(startDate, endDate).select("precipitation")
# Reduce the rainfall data to the district polygons
def reduce_image(img):
    img_reduced = img.reduceRegions(
        collection=districts,
        reducer=ee.Reducer.mean(),
        scale=5500
    )
    return img_reduced
rainfall_reduced = chirps.map(reduce_image).flatten()
... but I get an error message saying
EEException: Unrecognized argument type to convert to a FeatureCollection
Also, when I try adding
.featureBounds(districts)
to the chirps import, I get an error message saying
EEException: Invalid GeoJSON geometry.
I have tried changing the code for hours but don't seem to be able to make it work.
Could anyone tell me how I can calculate monthly average precipitation for each district, and ultimately download them as a .csv file?
Thank you very much in advance!
reduceRegions() expects an ee.FeatureCollection, not a GeoDataFrame. We need to get the 'features' from the shapefile's GeoJSON using to_json() and wrap each one in an ee.Feature to build an ee.FeatureCollection:
import json

file_name = '/content/drive/My Drive/Colab Notebooks/DATA_FOLDERS/SHP/gadm41_PRY_2.shx'
districts = gpd.read_file(file_name)
fc = []
for i in range(districts.shape[0]):
    g = districts.iloc[i:i + 1, :]
    # json.loads parses JSON safely; eval() fails on JSON literals such as null
    json_dict = json.loads(g.to_json())
    geo_json_dict = json_dict['features'][0]
    fc.append(ee.Feature(geo_json_dict))
districts = ee.FeatureCollection(fc)
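As a variation on the loop above, the whole GeoDataFrame can be serialised once and its features wrapped in a single pass. This is only a minimal sketch: the helper names geojson_features and to_ee_feature_collection are made up for illustration, and ee is imported lazily on the assumption that the Earth Engine client has already been authenticated and initialised. Parsing with json.loads rather than eval matters when attribute values are missing, because JSON writes them as null, which is not a valid Python name.

```python
import json


def geojson_features(geojson_text):
    """Parse a GeoJSON FeatureCollection string and return its list of features."""
    return json.loads(geojson_text)["features"]


def to_ee_feature_collection(gdf):
    """Convert a GeoDataFrame to an ee.FeatureCollection (hypothetical helper)."""
    import ee  # lazy import; assumes ee.Initialize() has already been called

    features = geojson_features(gdf.to_json())
    return ee.FeatureCollection([ee.Feature(f) for f in features])
```

With this in place, the conversion would read districts = to_ee_feature_collection(gpd.read_file(file_name)). If the shapefile is not in longitude/latitude, reprojecting it first with to_crs(epsg=4326) is probably needed, since plain GeoJSON coordinates are interpreted as WGS84.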
We also need to call mosaic() on the ee.ImageCollection (chirps), which collapses it into a single ee.Image:
chirps = ee.ImageCollection('UCSB-CHG/CHIRPS/DAILY').filterDate(startDate, endDate).select("precipitation").mosaic()
reduceRegions() is an ee.Image method, so we cast the input with ee.Image() and then call getInfo() to fetch the result to the client:
def reduce_image(img):
    img_reduced = ee.Image(img).reduceRegions(
        reducer=ee.Reducer.mean(),
        collection=districts,
        scale=5500,
    ).getInfo()
    return img_reduced
Altogether we have:
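import json

import ee
import geopandas as gpd

# Assumes ee.Authenticate() / ee.Initialize() have already been run in the notebook
file_name = '/content/drive/My Drive/Colab Notebooks/DATA_FOLDERS/SHP/gadm41_PRY_2.shx'
districts = gpd.read_file(file_name)

# Convert each GeoDataFrame row into an ee.Feature
fc = []
for i in range(districts.shape[0]):
    g = districts.iloc[i:i + 1, :]
    # json.loads parses JSON safely; eval() fails on JSON literals such as null
    json_dict = json.loads(g.to_json())
    geo_json_dict = json_dict['features'][0]
    fc.append(ee.Feature(geo_json_dict))
districts = ee.FeatureCollection(fc)

#startDate = ee.Date('2004-01-01')
startDate = ee.Date('2020-01-01')
endDate = ee.Date('2021-12-31')
# mosaic() collapses the ImageCollection into a single ee.Image
chirps = ee.ImageCollection('UCSB-CHG/CHIRPS/DAILY').filterDate(startDate, endDate).select("precipitation").mosaic()

def reduce_image(img):
    # Mean precipitation per district, fetched to the client with getInfo()
    img_reduced = ee.Image(img).reduceRegions(
        reducer=ee.Reducer.mean(),
        collection=districts,
        scale=5500,
    ).getInfo()
    return img_reduced

rainfall_reduced = reduce_image(chirps)
print(rainfall_reduced)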
Output (only a subset is shown, since the full output is extremely long):
...
[-56.286949156999924, -24.767101287999935],
[-56.28482055699993, -24.76206016599997],
[-56.28269958499993, -24.757202148999852],
[-56.28142547599998, -24.754322050999917],
[-56.28020477399997, -24.751232146999882],
[-56.28010559099994, -24.751117705999945]]]},
'id': '217',
'properties': {'CC_2': 'NA',
'COUNTRY': 'Paraguay',
'ENGTYPE_2': 'District',
'GID_0': 'PRY',
'GID_1': 'PRY.18_1',
'GID_2': 'PRY.18.15_1',
'HASC_2': 'PY.SP.YN',
'NAME_1': 'San Pedro',
'NAME_2': 'Yataity del Norte',
'NL_NAME_1': 'NA',
'NL_NAME_2': 'NA',
'TYPE_2': 'Distrito',
'VARNAME_2': 'NA',
'mean': 0}}]}
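Most of that output is polygon coordinates. If only the attribute values are of interest, the dictionary returned by getInfo() can be reduced to one row of properties per district before printing or saving. The following is a rough sketch using only the standard library; features_to_rows and rows_to_csv are hypothetical helper names, and the input is assumed to have the 'features' / 'properties' layout shown above.

```python
import csv
import io


def features_to_rows(feature_collection_info):
    """Return one dict of properties per feature, discarding the geometry."""
    return [dict(f["properties"]) for f in feature_collection_info["features"]]


def rows_to_csv(rows):
    """Serialise a list of property dicts to CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, rows = features_to_rows(rainfall_reduced) followed by writing rows_to_csv(rows) to a file would give a small table with columns such as NAME_1, NAME_2 and mean.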
Note: I had to limit the date range (i.e. startDate) because the full output is very large and triggers this error/warning message:
IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
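The code above reduces a single composite image, so it returns one value per district rather than one per month. For the monthly averages and the .csv download asked about in the question, one possible approach is to build one mean image per calendar month, reduce each over the districts, and let Earth Engine write the table to Drive instead of pulling it into the notebook with getInfo(), which also sidesteps the IOPub limit. This is an untested sketch under several assumptions: month_ranges, monthly_district_means and export_to_drive are hypothetical names, districts is the ee.FeatureCollection built earlier, ee has already been initialised, and the NAME_1 / NAME_2 columns are taken from the sample output.

```python
def month_ranges(first_year, last_year):
    """Return (start, end) ISO date pairs per calendar month; end is exclusive."""
    ranges = []
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            start = f"{year}-{month:02d}-01"
            next_year, next_month = (year + 1, 1) if month == 12 else (year, month + 1)
            ranges.append((start, f"{next_year}-{next_month:02d}-01"))
    return ranges


def monthly_district_means(districts, first_year=2004, last_year=2021):
    """Mean daily precipitation per district and month as one flat table."""
    import ee  # lazy import; assumes ee.Initialize() has already been called

    def tag(month):
        # Label every district row with the month it belongs to
        return lambda feature: feature.set("month", month)

    daily = ee.ImageCollection("UCSB-CHG/CHIRPS/DAILY").select("precipitation")
    tables = []
    for start, end in month_ranges(first_year, last_year):
        # Average of the daily values in the month; use .sum() for monthly totals
        monthly_mean = daily.filterDate(start, end).mean()
        stats = monthly_mean.reduceRegions(
            collection=districts, reducer=ee.Reducer.mean(), scale=5500
        )
        tables.append(stats.map(tag(start[:7])))
    return ee.FeatureCollection(tables).flatten()


def export_to_drive(table, description="chirps_monthly_by_district"):
    """Start a batch task that writes the table to Google Drive as CSV."""
    import ee

    task = ee.batch.Export.table.toDrive(
        collection=table,
        description=description,
        fileFormat="CSV",
        selectors=["NAME_1", "NAME_2", "month", "mean"],  # omits the geometry
    )
    task.start()
    return task
```

Calling export_to_drive(monthly_district_means(districts)) would queue the export; the task runs on Earth Engine's servers, so the CSV should appear in Drive once it finishes rather than in the notebook output.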