r ggplot2 broom geojsonio

Convert a geojson file to data.frame (from Eurostat) and make a map


I am trying to convert a Eurostat geojson file to a dataframe using the geojsonio and broom packages. However, when the file is converted with the broom::tidy() function, many of the columns in the geojson file are dropped, and the map I then create with ggplot is not correct. I need the geojson to be in a dataframe.

There's a way of getting geojson data from Eurostat converted into a dataframe with the eurostat package, but the problem is that its get_eurostat_geospatial() function only gets map data for Europe and I need the world map. Eurostat has the world map, but it is not retrievable using get_eurostat_geospatial().

My question is: how can I efficiently convert a geojson file into a dataframe and keep all features in the geojson?

This is the format I need, based on the get_eurostat_geospatial() function:

library(eurostat)

map = get_eurostat_geospatial(output_class = "df", resolution = 20, nuts_level = 0, year = 2016, crs = "4326")

# A tibble: 6,588 × 17
    long   lat order hole  piece group id    NUTS_ID LEVL_CODE CNTR_CODE NAME_LATN NUTS_NAME MOUNT_TYPE URBN_TYPE COAST_TYPE FID  
   <dbl> <dbl> <int> <lgl> <fct> <fct> <chr> <chr>       <int> <chr>     <chr>     <chr>          <int>     <int>      <int> <chr>
 1 -7.03  43.5     1 FALSE 1     1.1   1     ES              0 ES        ESPAÑA    ESPAÑA             0         0          0 ES   
 2 -6.29  43.6     2 FALSE 1     1.1   1     ES              0 ES        ESPAÑA    ESPAÑA             0         0          0 ES 

And this is the code I am using to convert the geojson to a dataframe and produce the map, without success:

library(tidyverse)
library(broom)
library(geojsonio)

file = "LOCATION OF THE GEOJSON FILE/CNTR_BN_20M_2020_4326.shp"

df = geojson_read(file, what = "sp")
df_tidy = tidy(df)    

ggplot() +
  geom_polygon(data = df_tidy, aes(x=long, y=lat, group=group))



Solution

  • For geospatial data you'd probably want sf; ggplot2 supports it through the geom_sf() layer. sf objects can be handled like regular data.frames (joining / filtering / grouping / mutating, with or without dplyr), and sf supports both GeoJSON and Shapefile among other geospatial file formats. For fetching the world map from Eurostat you can use the giscoR package; it also provides gisco_get_nuts() for NUTS regions. Or use sf::st_read() directly on your shp and geojson files.

    library(sf)
    library(ggplot2)
    library(tibble)
    
    # the /vsicurl/ prefix enables caching; plain https://.. URIs work as well
    nuts <- st_read("/vsicurl/https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_BN_60M_2021_4326.geojson")
    as_tibble(nuts)
    #> # A tibble: 5,074 × 9
    #>    EU_FLAG EFTA_FLAG CC_FLAG LEVL_CODE NUTS_BN_ID COAS_FLAG OTHR_FLAG   FID
    #>    <chr>   <chr>     <chr>       <int>      <int> <chr>     <chr>     <int>
    #>  1 T       F         F               0        155 T         F           155
    #>  2 T       F         F               3        156 F         F           156
    #>  3 T       F         F               3        157 F         F           157
    #>  4 T       F         F               0        158 T         F           158
    #>  5 T       F         F               0        159 T         F           159
    #>  6 T       F         F               3        160 F         F           160
    #>  7 F       F         T               1        161 F         F           161
    #>  8 T       F         F               3        162 F         F           162
    #>  9 T       F         F               3        163 F         F           163
    #> 10 F       F         T               0        164 F         T           164
    #> # … with 5,064 more rows, and 1 more variable: geometry <LINESTRING [°]>
    
    p_nuts <- ggplot(nuts) +
      geom_sf(linewidth = .1) +
      labs(caption = "NUTS_BN_60M_2021_4326.geojson") +
      theme_bw()
    
    
    # countries from giscoR 
    world <- giscoR::gisco_get_countries()
    tibble::as_tibble(world)
    #> # A tibble: 257 × 6
    #>    CNTR_ID NAME_ENGL            ISO3_C…¹ CNTR_…² FID                    geometry
    #>    <chr>   <chr>                <chr>    <chr>   <chr>            <GEOMETRY [°]>
    #>  1 AR      Argentina            ARG      Argent… AR    MULTIPOLYGON (((-62.6452…
    #>  2 AS      American Samoa       ASM      Americ… AS    MULTIPOLYGON (((-170.628…
    #>  3 AT      Austria              AUT      Österr… AT    POLYGON ((16.94028 48.61…
    #>  4 AQ      Antarctica           ATA      Antarc… AQ    MULTIPOLYGON (((-57.1760…
    #>  5 AD      Andorra              AND      Andorra AD    POLYGON ((1.7258 42.5044…
    #>  6 AE      United Arab Emirates ARE      الإمار… AE    MULTIPOLYGON (((56.37424…
    #>  7 AF      Afghanistan          AFG      افغانس… AF    POLYGON ((74.88986 37.23…
    #>  8 AG      Antigua and Barbuda  ATG      Antigu… AG    MULTIPOLYGON (((-61.6902…
    #>  9 AI      Anguilla             AIA      Anguil… AI    POLYGON ((-63.09693 18.1…
    #> 10 AL      Albania              ALB      Shqipë… AL    POLYGON ((20.0763 42.555…
    #> # … with 247 more rows, and abbreviated variable names ¹​ISO3_CODE, ²​CNTR_NAME
    
    p_world <- ggplot(world) +
      geom_sf() +
      labs(caption = "giscoR::gisco_get_countries()") +
      theme_bw()
    
    patchwork::wrap_plots(p_nuts, p_world)
    

    Created on 2023-02-23 with reprex v2.0.2
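
    If a plain long/lat data frame in the style of get_eurostat_geospatial(output_class = "df") is still needed, one possible approach is to flatten the sf object with sf::st_coordinates() and attach the attribute columns back by feature index. The following is only a minimal sketch under stated assumptions: the GeoJSON content, the property names (CNTR_ID, NAME_ENGL) and the object names (shapes, df_long) are made up for illustration, and the resulting columns only approximate the eurostat layout.

```r
library(sf)

# Made-up GeoJSON with two square polygons; property names are illustrative only
geojson_txt <- '{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"CNTR_ID": "AA", "NAME_ENGL": "Country A"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0,0],[1,0],[1,1],[0,1],[0,0]]]}},
    {"type": "Feature",
     "properties": {"CNTR_ID": "BB", "NAME_ENGL": "Country B"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[2,0],[3,0],[3,1],[2,1],[2,0]]]}}
  ]
}'
path <- tempfile(fileext = ".geojson")
writeLines(geojson_txt, path)

# Read the file; all feature properties become regular columns
shapes <- st_read(path, quiet = TRUE)

# Cast to a single geometry type so st_coordinates() returns the same
# index columns for every feature: L1 = ring, L2 = polygon part, L3 = feature
shapes <- st_cast(shapes, "MULTIPOLYGON")

# One row per vertex
coords <- as.data.frame(st_coordinates(shapes))

# Attribute table without the geometry column
attrs <- st_drop_geometry(shapes)

# Repeat each feature's attributes for all of its vertices
df_long <- cbind(
  data.frame(
    long  = coords$X,
    lat   = coords$Y,
    hole  = coords$L1 > 1,  # rings after the first one are holes
    group = paste(coords$L3, coords$L2, coords$L1, sep = ".")
  ),
  attrs[coords$L3, , drop = FALSE]
)
rownames(df_long) <- NULL
```

    A data frame built this way should work with geom_polygon(aes(x = long, y = lat, group = group)), although holes are not cut out unless they are handled separately, so geom_sf() on the sf object remains the simpler route for plotting.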