rcoordinatesrastermap-projectionswgs84

raster conversion from WGS84 to Mollweide alters values?


I have a question regarding conversion of rasters with categorical values from WGS84 to Mollweide projections. It looks like to conversion leads to alteration of the dataset values. Very unfortunately, I struggle to provide you with a reproducible example, so I’ll provide you with some details about my approach. You may have some tips on where my issue could come from, as this may be a common issue. The EU website https://ghsl.jrc.ec.europa.eu/download.php?ds=bu gives me access to the following rasters:

My challenge is that I get different numbers for each of these approaches. I wonder why this is the case:

Approach 1) Change SMOD Mollweide projections to WGS84

# layers as provided by website
SMOD_MollweideProj <-  raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif") 
POP_WGSproj <- raster("./GHS_POP_E2015_GLOBE_R2019A_4326_30ss_V1_0.tif") 
SMOD_WGSproj <- projectRaster(from=SMOD_MollweideProj, to= POP_WGSproj, method='ngb' , over=T )
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural  domain".
SMOD_rur_mask_1K <- SMOD_WGSproj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] =  1
SMOD_urb_mask_1K <- SMOD_WGSproj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_WGSproj <-  POP_WGSproj * SMOD_rur_mask_1K
POP_urb_1K_WGSproj <-  POP_WGSproj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_WGSproj, sum, na.rm=T) 
#2024108119 which is different to the value I get with the second approach
cellStats(POP_urb_1K_WGSproj, sum, na.rm=T)
#5321638069 which is different to the value I get with the second approach

> SMOD_MollweideProj
class      : RasterLayer 
dimensions : 18000, 36082, 649476000  (nrow, ncol, ncell)
resolution : 1000, 1000  (x, y)
extent     : -18041000, 18041000, -9e+06, 9e+06  (xmin, xmax, ymin, ymax)
crs        : +proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs 
values     : 10, 30  (min, max)

> SMOD_WGSproj
class      : RasterLayer 
dimensions : 21600, 43200, 933120000  (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333  (x, y)
extent     : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
values     : 10, 30  (min, max)

> POP_WGSproj
class      : RasterLayer 
dimensions : 21600, 43200, 933120000  (nrow, ncol, ncell)
resolution : 0.008333333, 0.008333333  (x, y)
extent     : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
values     : 0, 459434.6  (min, max)

Approach 2) Work with SMOD and POP Mollweide projections

# layers as provided by website
SMOD_MollweideProj <-  raster ("./GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0.tif") 
POP_MollweideProj <- raster("./GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0.tif") 
#create rural and urban masks - Classes 30-23-22-21 if aggregated form the "urban domain", 13-12-11-10 form the "rural  domain".
SMOD_rur_mask_1K <- SMOD_MollweideProj
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) >14] = NA
values(SMOD_rur_mask_1K)[values(SMOD_rur_mask_1K) <=13] =  1
SMOD_urb_mask_1K <- SMOD_MollweideProj
SMOD_urb_mask_1K[SMOD_urb_mask_1K<20 ] <- NA
SMOD_urb_mask_1K[SMOD_urb_mask_1K>=21 ] <- 1
#Generate rural and urban population layers, based on total population per grid cell and rural and urban masks
POP_rur_1K_MollweideProj <-  POP_MollweideProj * SMOD_rur_mask_1K
POP_urb_1K_MollweideProj <-  POP_MollweideProj * SMOD_urb_mask_1K
#urban and rural population estimates
cellStats(POP_rur_1K_MollweideProj, sum, na.rm=T)
# 1726372189 which is different to the value I get with the first approach
cellStats(POP_urb_1K_MollweideProj, sum, na.rm=T)
# 5622956252 which is different to the value I get with the first approach

Thank you very much for your suggestions


Solution

  • Following on from the comments above, here's my suggested code for your Approach 2, i.e. working with Mollweide projection for both SMOD and POP. I downloaded data for a single cell rather than the global layer, simply to reduce execution time.

    Note I've used raster::mask() to mask POP to urban & rural. With this method there is no need to set a value of 1 in the masks, you can just retain the original values after setting to NA the cells you wish to mask. See ?raster::mask.

    library(raster)
    
    smod <- raster("data/GHS_SMOD_POP2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
    pop <- raster("data/GHS_POP_E2015_GLOBE_R2019A_54009_1K_V1_0_11_4.tif")
    
    # produce rural mask
    maskRural <- smod
    maskRural[maskRural > 14] <- NA
    
    # produce urban mask
    maskUrban <- smod
    maskUrban[maskUrban < 20] <- NA
    
    # use raster::mask to produce rural and urban population layers
    popRural <- raster::mask(pop, maskRural)
    popUrban <- raster::mask(pop, maskUrban)
    
    # check that the total population of rural + urban is equal to the original
    sum(pop[], na.rm = TRUE)
    sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
    

    The final couple of lines just check that the sum of rural and urban populations equals the original total population. So for the single cell I used the results are:

    > sum(pop[], na.rm = TRUE)
    [1] 67487835
    > sum(popUrban[], na.rm = TRUE) + sum(popRural[], na.rm = TRUE)
    [1] 67487835
    

    To project to WGS84 you could do something like this:

    popUrbanWgs84 <- projectRaster(popUrban, crs = crs("+init=epsg:4326"), method = "bilinear")
    

    You can also specify a res parameter here, otherwise it will choose one for you, but the question would be what resolution to use? The original resolution of 1km is roughly 0.008 degrees but in terms of longitude this varies across the globe.

    My suggestion is to stick to Mollweide if at all possible.