rggplot2mapsr-sp

How do I get points (Lat. & Long.) to show up on my shapefile map?


I am trying to plot gun violence occurrence coordinate points onto a shapefile map of New York. I applied a Google script to turn the original dataset street addresses into latitude and longitude coordinates. I exported it .csv into RStudio. Cleaned up the data a little more by removing unnecessary columns and NA values + changing the Lat. column to numeric.

I seem to have done everything right up until layering the points atop the shapefile map. When I run the following code, it returns the coordinate points separately from the map (see images attached). That is, they are not layered together so I can eventually use it to create a choropleth map. Also, the point image below doesn't seem to show all the latitude/longitude coordinates in the dataset. There are a total of 500 or so incidences with coordinate data provided spread out across the entire state of New York. I am not so confident in what is being shown, but that is probably a topic for another question.

library(data.table)
library(sp)
library(rgdal)
library(ggplot2)


df_2 <- Gun_Violence_Clean_3 %>%
  select(-`Incident ID`, -Operations, -`City 2`, -Combine, -`Lat & Long`) %>%
  na.omit()
df_2

df_2$Lat <- as.numeric(df_2$Lat)

coordinates(df_2) = c("Lat","Long")
crs.geo1 = CRS("+proj=longlat")  
proj4string(df_2) = crs.geo1  

plot(df_2, pch = 20, col = "steelblue")


New_York = readOGR(dsn = "./NYS Boundaries", layer = "new-york-state-city-and-town- 
boundaries")
plot(New_York)
points(df_2, pch = 20, col = "orange")

Reproducible data:

structure(list(`Incident ID` = c(1664753, 1664770, 1664768, 1664751, 
1664723, 1664721), `Incident Date` = c("23-Apr-20", "22-Apr-20", 
"22-Apr-20", "22-Apr-20", "22-Apr-20", "22-Apr-20"), State = c("New 
York", 
"New York", "New York", "New York", "New York", "New York"), 
`City Or County` = c("Buffalo", "Schenectady", "Schenectady", 
"Albany", "Brooklyn", "Corona (Queens)"), Address = c("50 block of 
Langmeyer Ave", 
"1009 McClyman St", "1013 McClyman St", "200 block of Second Ave", 
"255 Havemeyer St", "225-37 Murdock Ave"), `# Killed` = c(1, 
0, 0, 0, 0, 0), `# Injured` = c(0, 0, 1, 1, 0, 1), Operations = 
c("N/A", 
"N/A", "N/A", "N/A", "N/A", "N/A"), `City 2` = c("Buffalo", 
"Schenectady", "Schenectady", "Albany", "Brooklyn", "Corona (Queens)"
), Combine = c("50 block of Langmeyer Ave Buffalo", "1009 McClyman St 
Schenectady", 
"1013 McClyman St Schenectady", "200 block of Second Ave Albany", 
"255 Havemeyer St Brooklyn", "225-37 Murdock Ave Corona (Queens)"
), `Lat & Long` = c("42.92484, -78.815534", "#ERROR!", 
"42.80176729999999, -73.9331919", 
"42.6390962, -73.77026289999999", "40.7079479, -73.95942509999999", 
"40.703342, -73.731005"), Lat = c("42.92484000000", "#ERROR!", 
"42.80176730000", "42.63909620000", "40.70794790000", "40.70334200000"
), Long = c(-78.815534, NA, -73.9331919, -73.7702629, -73.9594251, 
-73.731005)), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

Points Map of New York

Dataset Screenshot

Where I got the shapefile of New York State

Thank you very much for your help.


Solution

  • Using this answer as a guide: out of bounds latitude and longitude values in converted shape file using ggplot

    Here is a bare minimum solution:

    library(rgdal)
    library(sp)
    
    df_2 <- structure(list(`Incident Date` = c("23-Apr-20", "22-Apr-20", 
    "22-Apr-20", "22-Apr-20", "22-Apr-20"), `City Or County` = c("Buffalo", 
    "Schenectady", "Albany", "Brooklyn", "Corona (Queens)"), `# Killed` = c(1, 
    0, 0, 0, 0), `# Injured` = c(0, 1, 1, 0, 1), Lat = c(42.92484, 
    42.8017673, 42.6390962, 40.7079479, 40.703342), Long = c(-78.815534, 
    -73.9331919, -73.7702629, -73.9594251, -73.731005)), row.names = c(NA, 
    -5L), na.action = structure(c(`2` = 2L), class = "omit"), class = c("tbl_df", 
    "tbl", "data.frame"))
    
    #read shape file
    New_York = readOGR(dsn = "./NYS Boundaries", layer = "Counties")
    
    #Transform the shape file coordinates to Lat/Longitude
    NY <-spTransform(New_York, CRS("+proj=longlat +ellps=WGS84 +datum=WGS84"))
    plot(NY, xlab="long")
    axis(1)  #Check the coordinates
    axis(2)
    #plot the points
    points(x=df_2$Long, y=df_2$Lat, pch = 20, col = "orange")
    

    enter image description here