I have a question regarding conversion of rasters with categorical values from WGS84 to Mollweide projections. It looks like to conversion leads to alteration of the dataset values. Very unfortunately, I struggle to provide you with a reproducible example, so I’ll provide you with some details about my approach. You may have some tips on where my issue could come from, as this may be a common issue. The EU website https://ghsl.jrc.ec.europa.eu/download.php?ds=bu gives me access to the following rasters:
SMOD layer provides me with human settlement info (SMOD dataset which I transformed into 1) a urban mask with “1” for urban areas and “NA” for non urban areas 2) a rural mask with “1” for rural areas and “NA” for non rural areas). SMOD is available in Mollweide projections only.
POP layer provides me with human population density (number of people per grid cell). POP is available in both Mollweide and WGS84. I tried two approaches to estimate rural and urban human population numbers.
My challenge is that I get different numbers for each of these approaches. I wonder why this is the case:
Approach 1) Change SMOD Mollweide projections to WGS84
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_WGSproj <- raster("./GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.tif")
SMOD_WGSproj <- projectRaster(from=SMOD_MollweideProj, to= POP_WGSproj, method='ngb' , over=T )
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_WGSproj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_WGSproj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_WGSproj <- POP_WGSproj * SMOD_rur_mask_1K
POP_urb_1K_WGSproj <- POP_WGSproj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_WGSproj, sum, na.rm=T)
#2024108119 which is different to the value I get with the second approach
cellStats(POP_urb_1K_WGSproj, sum, na.rm=T)
#5321638069 which is different to the value I get with the second approach
> SMOD_MollweideProj
class : RasterLayer
dimensions : 18000, 36082, 649476000 (nrow, ncol, ncell)
resolution : 1000, 1000 (x, y)
extent : -18041000, 18041000, -9e+06, 9e+06 (xmin, xmax, ymin, ymax)
crs : +proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs
values : 10, 30 (min, max)
> SMOD_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 10, 30 (min, max)
> POP_WGSproj
class : RasterLayer
dimensions : 21600, 43200, 933120000 (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333 (x, y)
extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
values : 0, 459434.6 (min, max)
Approach 2) Work with SMOD and POP Mollweide projections
# layers as provided by website
SMOD_MollweideProj <- raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif")
POP_MollweideProj <- raster("./GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0.tif")
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural domain".
SMOD_rur_mask_1K <- SMOD_MollweideProj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] = 1
SMOD_urb_mask_1K <- SMOD_MollweideProj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_MollweideProj <- POP_MollweideProj * SMOD_rur_mask_1K
POP_urb_1K_MollweideProj <- POP_MollweideProj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_MollweideProj, sum, na.rm=T)
# 1726372189 which is different to the value I get with the first approach
cellStats(POP_urb_1K_MollweideProj, sum, na.rm=T)
# 5622956252 which is different to the value I get with the first approach
Thank you very much for your suggestions
Following on from the comments above, here's my suggested code for your Approach 2, i.e. working with Mollweide projection for both SMOD and POP. I downloaded data for a single cell rather than the global layer, simply to reduce execution time.
Note I've used raster::mask()
to mask POP to urban & rural. With this method there is no need to set a value of 1
in the masks, you can just retain the original values after setting to NA
the cells you wish to mask. See ?raster::mask
.
library(raster)
smod <- raster("data/GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
pop <- raster("data/GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
# produce rural mask
maskRural <- smod
maskRural[maskRural > 14] <- NA
# produce urban mask
maskUrban <- smod
maskUrban[maskUrban < 20] <- NA
# use raster::mask to produce rural and urban population layers
popRural <- raster::mask(pop, maskRural)
popUrban <- raster::mask(pop, maskUrban)
# check that the total population of rural + urban is equal to the original
sum(pop[], na.rm = TRUE)
sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
The final couple of lines just check that the sum of rural and urban populations equals the original total population. So for the single cell I used the results are:
> sum(pop[], na.rm = TRUE)
[1] 67487835
> sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
[1] 67487835
To project to WGS84 you could do something like this:
popUrbanWgs84 <- projectRaster(popUrban, crs = crs("+init=epsg:4326"), method = "bilinear")
You can also specify a res
parameter here, otherwise it will choose one for you, but the question would be what resolution to use? The original resolution of 1km is roughly 0.008 degrees but in terms of longitude this varies across the globe.
My suggestion is to stick to Mollweide if at all possible.