rgeojsonspatialr-spogr

sp::over(). Does the dot belong to one of the polygons identified with an OGRGeoJSON file?


I'm trying to get a boolleans vector, where for example, v[i] =1 tells me if an i-th point (latitude longitude pair, present inside a train dataframe) falls within one of the geographical areas identified by an OGRGeoJSON file.

The OGR file is structured roughly like this:

That's what I tried to do.

However, the results obtained are not correct because the polygonal that is generated is a mix of all the various areas present in the OGR file.

library(rgdal)
library(httr)
library(sp)

r <- GET('https://data.cityofnewyork.us/api/geospatial/tqmj-j8zm?method=export&format=GeoJSON')
nyc_neighborhoods <- readOGR(content(r,'text'), 'OGRGeoJSON', verbose = F)

#New York City polygonal
pol_lat <- c(nyc_neighborhoods_df$lat)
pol_long <- c(nyc_neighborhoods_df$long)
xy <- cbind(pol_lat, pol_long)
p = Polygon(xy)
ps = Polygons(list(p),1)
pol = SpatialPolygons(list(ps))

#Points to analyse (pair of coordinates)
ny_lat <- c(train$pickup_latitude, train$dropoff_latitude)
ny_long <- c(train$pickup_longitude, train$dropoff_longitude)
ny_coord <- cbind(ny_lat, ny_long)
pts <- SpatialPoints(ny_coord)

#Query: Does the point to analyze fall in or out NYC?
over(pts, pol, returnList = TRUE)

How can I fix this to get the correct result?


Solution

  • sp is an older package which is being phased out in favor of the newer "Simple Features" sf package. Let me know if you are open to using the pipe operator %>% from the magrittr package, as it works nicely with the sf package (as does dplyr and purrr).

    Using sf, you could do:

    library(sf)
    
    # Replace this with the path to the geojson file
    geojson_path <- "path/to/file.geojson"
    
    boroughs <- sf::st_read(dsn = geojson_path, stringsAsFactors = FALSE)
    
    

    Now making a very simple spatial point object to stand in for the "trains" data.

    # Make test data.frame
    
    test_df <- 
      data.frame(
      # Random test point I chose, a couple of blocks from Central Park
          a = "manhattan_point", 
          y = 40.771959, 
          x = -73.964128, 
          stringsAsFactors = FALSE)
    
    # Turn the test_df into a spatial object
    test_point <-
      sf::st_as_sf(
        test_df,
        # The coords argument tells the st_as_sf function
        # what columns store the longitude and latitude data
        # which it uses to associate a spatial point to each
        # row in the data.frame
        coords = c("x", "y"), 
        crs = 4326 # WGS84
        )
    
    

    Now we are ready to determine what polygon(s) our point falls in:

    # Get the sparse binary predicate. This will give a list with as 
    # many elements as there are spatial objects in the first argument, 
    # in this case, test_point, which has 1 element.
    # It also has attributes which detail what the relationship is
    # (intersection, in our case) 
    sparse_bin_pred <- sf::st_intersects(test_point, boroughs)
    
    # Output the boro_name that matched. I think the package purrr
    # offers some more intuitive ways to do this, but
    lapply(
      sparse_bin_pred, 
      function(x) boroughs$boro_name[x]
      )
    
    

    That last part outputs:

    [[1]]
    [1] "Manhattan"