I am completely new to geospatial analysis and am at a lost as to how to start. I would like to know what the sex ratio of Vietnam was in 2000, broken down by grid square per https://hub.worldpop.org/geodata/summary?id=11943.
How do I download and read this data in R, and extract the male and female population per grid square of .tiff files?
Furthermore, how would I be able to aggregate the data to find the sex ratio of 20 year holds per district?
Thank you very much.
Using terra
, download the ages you want to compare. This will be slow as server appears throttled. Using 2000, female/male
library(terra)
vnm_f_2k_20 <- rast('~/Downloads/vnm_f_20_2000.tif')
vnm_m_2k_20 <- rast('~/Downloads/vnm_m_20_2000.tif')
# plot(vnm_m_2k_20) you could but not much to look at
vnm_20_ratio <- vnm_f_2k_20/vnm_m_2k_20 # female/male
writeRaster(vnm_20_ratio, '~/Downloads/vnm_20_2k_fm_ratio.tif')
# source again
vnm_20_ration <- rast('~/Downloads/vnm_20_2k_fm_ratio.tif')
plot(vnm_20_ratio)
# import some (bad) shape files
viet_shp_vect <- vect('~/viet/Boundary.shp')
viet_shp_vect
class : SpatVector
geometry : lines
dimensions : 4588, 3 (geometries, attributes)
extent : 102.1421, 116.9473, 6.95331, 23.39391 (xmin, xmax, ymin, ymax)
source : Boundary.shp
coord. ref. : lon/lat WGS 84 (EPSG:4326)
names : gid Code Type
type : <int> <chr> <chr>
values : 1 BA010 Coastline
24 BA010 Coastline
52 BA010 Coastline
# we want polygons
viet_shp_vect_poly <- as.polygons(viet_shp_vect)
viet_shp_vect_poly
class : SpatVector
geometry : polygons
dimensions : 882, 3 (geometries, attributes)
extent : 103.418, 116.9473, 6.95331, 21.51663 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
names : gid Code Type
type : <int> <chr> <chr>
values : 1 BA010 Coastline
24 BA010 Coastline
52 BA010 Coastline
# okay, fewer
# which biggest? to see on vnm_20_ratio
viet_poly_expanse = expanse(viet_terra_poly, unit = 'km') # km^2
viet_terra_poly[which(viet_poly_expanse > 125)]
class : SpatVector
geometry : polygons
dimensions : 3, 3 (geometries, attributes)
extent : 103.8319, 107.6092, 10.00636, 21.27627 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat WGS 84 (EPSG:4326)
names : gid Code Type
type : <int> <chr> <chr>
values : 843 BA010 Coastline
4020 AC02 District boundary
287 BA010 Coastline
# get their extent(s)
ext(viet_terra_poly[which(viet_poly_expanse > 125)][1])
SpatExtent : 107.377707546894, 107.609243224099, 21.0418956379434, 21.2762675716601 (xmin, xmax, ymin, ymax)
ext(viet_terra_poly[which(viet_poly_expanse > 125)][2])
SpatExtent : 103.831937267764, 104.079663940524, 10.0063610568506, 10.4522918854626 (xmin, xmax, ymin, ymax)
ext(viet_terra_poly[which(viet_poly_expanse > 125)][3])
SpatExtent : 106.915154049455, 107.108447974102, 20.7130312182932, 20.8800654844975 (xmin, xmax, ymin, ymax)
first_unk_viet_visible = crop(vnm_20_ratio,ext(viet_terra_poly[which(viet_poly_expanse > 125)][1])
plot(first_unk_viet_visible)
You have shapefiles, I have geojson and we both need to figure out which level of detail, at the provincial level, or district level and all the little Spratlies that that entails.
For your consideration of which level to apply your analysis, you really should look back to the worldpop
data and what level it was collected at. This is the concept of 'support' in spatial analysis. If worldpop is provincial, you should use provincial, if district, then district. Hopefully your gadm is better organized than mine where I was scratching around for something that would be visible, having say names of recognizable places. The quick tour.
If you just need values by geographic area
median(extract(vnm_20_ratio, viet_terra_poly[which(viet_poly_expanse > 125)][1])[, 2], na.rm = TRUE)
[1] 0.8670955