I'm using the geonames package to request country names, which I can do manually, but I don't understand how to make the API call for each row in my table.
loc2$country = GNcountryCode(loc2$lon, loc2$lat)$countryCode
My intention is to create a new column "country" populated with the corresponding code, but it appears to simply concatenate all the latitudes & longitudes into a single request.
I apologise for the basic nature of the question. I have no experience with R at all. I can't figure out how to make the function call work.
Here's row 1:
loc2[1,]
latitudeE7 longitudeE7 accuracy activity source deviceTag
1 375800672 1268884670 22 ON_BICYCLE, ON_FOOT, IN_VEHICLE, UNKNOWN, 34, 30, 21, 13, 2014-01-24T10:12:51.748Z WIFI 1521681206
timestamp velocity altitude verticalAccuracy platformType serverTimestamp deviceTimestamp batteryCharging formFactor heading
1 2014-01-24T10:12:50.011Z NA NA NA <NA> <NA> <NA> NA <NA> NA
deviceDesignation lat lon day
1 <NA> 37.58007 126.8885 2014-01-24
Background:
Officialdom requires me to identify which countries I've visited, and duration, over the last 10 years. I travel a lot, often by different modes there/back/on to another country, even on foot or by bicycle, so I don't have comprehensive formal documentation (like tickets) that contain this information.
I've never used R but after a bit of reading I thought it would be simplest to analyse my Google Location History (although I often keep airplane mode enabled to prolong battery life, so even that's not comprehensive, but a start...)
I have a data table with the JSON data downloaded, and have reduced the number of rows by a factor of 500 by selecting only unique days. The geonames site allows 1000 calls per hour.
Yes, I know, some (sensible) people will ask, if even I have no idea where I've been, why would I need to compile this data? I could confabulate a plausible fiction but this has become an obsession in itself. I haven't done any computer work at all for over 10 years so I'm struggling a bit.
The GeoNames countryCode API does not support batch requests, so each call can include only a single coordinate pair. You could handle this with mapply(): define a function that takes two arguments (lat, lon) and extracts countryCode from the response, pass it as the first argument to mapply(), and pass the lat and lon vectors as the second and third arguments. mapply() will cycle through each lat-lon pair, call the function, and return a vector of results:
library(geonames)
# example locations:
loc2
#> lon lat
#> 1 -84.41688 77.88553
#> 2 -46.03540 -14.01990
#> 3 146.95480 59.73224
#> 4 -116.43957 47.22695
#> 5 60.64802 26.29448
loc2$country_gn <-
withr::with_options(
list(geonamesUsername=YOUR_GEONAMES_USERNAME),
mapply(\(lat, lon) GNcountryCode(lat, lon)$countryCode, loc2$lat, loc2$lon)
)
loc2
#> lon lat country_gn
#> 1 -84.41688 77.88553 CA
#> 2 -46.03540 -14.01990 BR
#> 3 146.95480 59.73224 RU
#> 4 -116.43957 47.22695 US
#> 5 60.64802 26.29448 IR
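Since the question mentions a limit of 1000 calls per hour, it may be worth spacing the requests out and not letting a single failed call abort the whole run. Below is a minimal sketch of that idea. lookup_country_codes() and fake_lookup() are hypothetical helpers, not part of geonames, and the sketch assumes that a failed request surfaces as an R error:

```r
# Hypothetical helper (not part of geonames): one lookup per coordinate pair,
# with a pause before each call and NA for any call that raises an error.
lookup_country_codes <- function(lat, lon, lookup, pause = 0) {
  stopifnot(length(lat) == length(lon))
  mapply(
    function(la, lo) {
      Sys.sleep(pause)  # spread requests out to stay under the hourly quota
      tryCatch(
        lookup(la, lo),
        error = function(e) NA_character_  # keep going if one call fails
      )
    },
    lat, lon,
    USE.NAMES = FALSE
  )
}

# Stand-in for the real API call so the sketch runs offline;
# it fails for negative latitudes to mimic a request error.
fake_lookup <- function(lat, lon) {
  if (lat < 0) stop("no country found")
  "XX"
}

codes <- lookup_country_codes(c(10, -5, 20), c(1, 2, 3), fake_lookup)
codes

# With the real API (needs the geonames package, options(geonamesUsername = ...)
# and network access); 3600 s / 1000 calls = 3.6 s between calls:
# loc2$country_gn <- lookup_country_codes(
#   loc2$lat, loc2$lon,
#   lookup = function(lat, lon) geonames::GNcountryCode(lat, lon)$countryCode,
#   pause = 3.6
# )
```

Passing the lookup function in as an argument keeps the looping logic testable without network access; rows that come back as NA can be retried later.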
Though you could also handle this without any external API: fetch a dataset of country polygons (e.g. through giscoR for GISCO datasets or rnaturalearth for https://www.naturalearthdata.com/ data) and use a spatial join provided by the sf package to find matches for your point locations:
library(sf)
library(giscoR)
# CNTR_RG_20M_2016_4326 dataset
world <- gisco_countries
# for high(er) resolution dataset from 2024:
# world <- gisco_get_countries(year = "2024", resolution = "01")
# convert loc2 to a spatial data frame;
# spatial join with world[, "CNTR_ID"] to match each loc2 location to a country polygon;
# extract CNTR_ID column;
loc2$country_cisco <-
st_join(
st_as_sf(loc2, coords = c("lon", "lat"), crs = "WGS84"),
world[, "CNTR_ID"]
)$CNTR_ID
loc2
#> lon lat country_gn country_cisco
#> 1 -84.41688 77.88553 CA CA
#> 2 -46.03540 -14.01990 BR BR
#> 3 146.95480 59.73224 RU RU
#> 4 -116.43957 47.22695 US US
#> 5 60.64802 26.29448 IR IR
Note that different geospatial datasets might take different approaches to labelling some areas, which is something to consider when you have to deal with locations like Crimea or Northern Cyprus. This applies to reverse geocoding APIs as well.
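One way to spot such differences is to compare the two result columns row by row. The sketch below uses a small hypothetical data frame (the coordinates and codes are made up for illustration, not real lookup results). It also assumes, as far as I know, that GISCO's CNTR_ID follows the Eurostat convention of "EL" for Greece and "UK" for the United Kingdom, whereas GeoNames returns ISO codes ("GR", "GB"), so those are normalised before comparing:

```r
# Hypothetical results: country_gn holds ISO 3166-1 alpha-2 codes,
# country_cisco holds GISCO CNTR_ID values.
res <- data.frame(
  lon = c(23.7, -0.1, 2.35),
  lat = c(38.0, 51.5, 48.85),
  country_gn = c("GR", "GB", "FR"),
  country_cisco = c("EL", "UK", "FR")
)

# A naive comparison flags rows that differ only in coding convention
naive_mismatch <- res$country_gn != res$country_cisco

# Translate the Eurostat-style codes to ISO before comparing (assumed mapping)
eurostat_to_iso <- c(EL = "GR", UK = "GB")
cisco_iso <- ifelse(
  res$country_cisco %in% names(eurostat_to_iso),
  eurostat_to_iso[res$country_cisco],
  res$country_cisco
)

# Rows where the two sources still disagree, or where either one is missing
mismatch <- is.na(res$country_gn) | is.na(cisco_iso) | res$country_gn != cisco_iso
res[mismatch, ]
```

Any rows that remain after normalisation are the ones worth checking by hand, for example points near borders or in disputed areas.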
Example locations:
set.seed(1)
loc2 <-
sf::st_sample(giscoR::gisco_countries, 5) |>
sf::st_coordinates() |>
`colnames<-`(c("lon", "lat")) |>
as.data.frame()
# the same sample as a literal, for reproducing without giscoR:
loc2 <- structure(list(lon = c(-84.4168839239306, -46.0353998519945,
146.954795316042, -116.439570855129, 60.6480190645839), lat = c(77.8855269367845,
-14.0199015626225, 59.7322442944768, 47.226945417709, 26.2944803596838
)), class = "data.frame", row.names = c(NA, -5L))
Created on 2024-10-09 with reprex v2.1.1