I read in a KML file, using the following:
clinics = st_read(dsn = "Data/clinics-kml.kml","CLINICS")
However, all my variables (except for the coordinates) got lumped into 1 column under Description
(see below link).
What's the best way to separate the variables out? Alternatively, is there a way to import KML files properly to avoid this issue? You may view the screenshot of the problem here.
The problem (or maybe not) is that the Description column has an html table as a string for each observation. That is fine if you want to parse that html string and get a pretty table for example when creating an interactive web map. But it can be a headache if you just want the data inside.
So, it's possible to do all the process within R just following these steps:
All the code is commented, see below:
library(tidyverse)
library(sf)
library(mapview)
library(rvest)
library(httr)
# 1) Download the kml file
moh_chas_clinics <- GET("https://data.gov.sg/dataset/31e92629-980d-4672-af33-cec147c18102/download",
write_disk(here::here("moh_chas_clinics.zip"), overwrite = TRUE))
# 2) Unzip the downloaded zip file
unzip(here::here("moh_chas_clinics.zip"))
# 3) Read the KML file as a Spatial object
moh_chas_clinics <- read_sf(here::here("chas-clinics-kml.kml"))
# Watch data
moh_chas_clinics %>%
glimpse()
# See map
mapview(moh_chas_clinics)
# 4) Get the attributes for each observation
# Option a) Using a simple lapply
attributes <- lapply(X = 1:nrow(moh_chas_clinics),
FUN = function(x) {
moh_chas_clinics %>%
slice(x) %>%
pull(Description) %>%
read_html() %>%
html_node("table") %>%
html_table(header = TRUE, trim = TRUE, dec = ".", fill = TRUE) %>%
as_tibble(.name_repair = ~ make.names(c("Attribute", "Value"))) %>%
pivot_wider(names_from = Attribute, values_from = Value)
})
# Option b) Using a Parallel lapply (faster)
future::plan("multisession")
attributes <- future.apply::future_lapply(X = 1:nrow(moh_chas_clinics),
FUN = function(x) {
moh_chas_clinics %>%
slice(x) %>%
pull(Description) %>%
read_html() %>%
html_node("table") %>%
html_table(header = TRUE, trim = TRUE, dec = ".", fill = TRUE) %>%
as_tibble(.name_repair = ~ make.names(c("Attribute", "Value"))) %>%
pivot_wider(names_from = Attribute, values_from = Value)
})
# 5) Bind the attributes to each observation as new columns
moh_chas_clinics_attr <-
moh_chas_clinics %>%
bind_cols(bind_rows(attributes)) %>%
select(-Description)
# Watch new data
moh_chas_clinics_attr %>%
glimpse()
# New map
mapview(moh_chas_clinics_attr,
zcol = "CLINIC_PROGRAMME_CODE",
layer.name = "Clinic Programme Code")
A final map as an example showing all the attributes for a point and coloured by "Clinic Programme Code":