I am searching for a set of edges in an igraph built from an osmar object and would like to change the weight of these. Since my graph is quite big, this task takes quite a long time. Since I run this function in a loop the runtime grows even bigger.
Is there a way I could optimize this?
Here is the code:
library(osmar)
library(igraph)
library(tidyr)
library(dplyr)
### Get data ----
src <- osmsource_api(url = "https://api.openstreetmap.org/api/0.6/")
muc_bbox <- center_bbox(11.575278, 48.137222, 1000, 1000)
muc <- get_osm(muc_bbox, src)
### Reduce to highways: ----
hways <- subset(muc, way_ids = find(muc, way(tags(k == "highway"))))
hways <- find(hways, way(tags(k == "name")))
hways <- find_down(muc, way(hways))
hways <- subset(muc, ids = hways)
#### Plot data ----
## Plot complete data and highways on top:
plot(muc)
plot_ways(muc, col = "lightgrey")
plot_ways(hways, col = "coral", add = TRUE)
### Define route start and end nodes: ----
id<-find(muc, node(tags(v %agrep% "Sendlinger Tor")))[1]
hway_start_node <-find_nearest_node(muc, id, way(tags(k == "highway")))
hway_start <- subset(muc, node(hway_start_node))
id <- find(muc, node(attrs(lon > 11.58 & lat > 48.15)))[1]
hway_end_node <- find_nearest_node(muc, id, way(tags(k == "highway")))
hway_end <- subset(muc, node(hway_end_node))
## Add the route start and and nodes to the plot:
plot_nodes(hway_start, add = TRUE, col = "red", pch = 19, cex = 2)
plot_nodes(hway_end, add = TRUE, col = "red", pch = 19, cex = 2)
### Create street graph ----
gr <- as.undirected(as_igraph(hways))
### Compute shortest route: ----
# Calculate route
route <- function(start_node,end_node) {
get.shortest.paths(gr,
from = as.character(start_node),
to = as.character(end_node),
mode = "all")[[1]][[1]]}
# Plot route
plot.route <- function(r,color) {
r.nodes.names <- as.numeric(V(gr)[r]$name)
r.ways <- subset(hways, ids = osmar::find_up(hways, node(r.nodes.names)))
plot_ways(r.ways, add = TRUE, col = color, lwd = 2)
}
nways <- 1
numway <- 1
r <- route(hway_start_node,hway_end_node)
# Plot route
color <- colorRampPalette(c("springgreen","royalblue"))(nways)[numway]
plot.route(r,color)
## Route details ----
# Construct a new osmar object containing only elements
# related to the nodes defining the route:
route_nodes <- as.numeric(V(gr)[r]$name)
route_ids <- find_up(hways, node(route_nodes))
osmar.route <- subset(hways, ids = route_ids)
osmar.nodes.ids <- osmar.route$nodes$attrs$id
# Extract the nodes’ coordinates,
osmar.nodes.coords <- osmar.route$nodes$attrs[, c("lon", "lat")]
osmar.nodes <- cbind(data.frame(ids = osmar.nodes.ids),
data.frame(ids_igraph = as.numeric(V(gr)[r]) ),
osmar.nodes.coords)
## Find edges ids containing points of interest ----
wished.coords <- data.frame(wlon = c(11.57631),
wlat = c(48.14016))
# Calculate all distances
distances <- crossing(osmar.nodes,wished.coords) %>%
mutate(dist = geosphere::distHaversine(cbind(lon,lat),cbind(wlon,wlat)))
# Select nodes below maximum distance :
mindist <- 50 #m
wished.nodes <- distances %>% filter(dist < mindist)
# Select edges incident to these nodes :
selected.edges <- unlist(incident_edges(gr,V(gr)[wished.nodes$ids_igraph]))
This is where the slowdown occurs: Weight of selected edges, change it by multiplying it with 10
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
This is where the slowdown occurs: Weight of selected edges, change it by multiplying it with 10
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
Maybe I could use a hashmap?
UPDATE
hashmap
Unit: seconds
Hashmap:
expr min lq mean median uq max neval
Hashmap 3.248543 3.289474 3.472038 3.324417 3.734050 4.188924 100
Without 3.267549 3.333012 3.557179 3.367015 3.776429 5.643784 100
Sadly it does not seemt to bring a lot of improvement.
library(hashmap)
#https://github.com/nathan-russell/hashmap
H <- hashmap(E(gr)[selected.edges],E(gr)[selected.edges]$weight)
sapply(H$find(E(grr)[selected.edges]), function(x) x * 10)
UPDATE: According to igraph doc, igraph is thread safe so I could use parallel.
Im currently trying this:
no_cores <- detectCores(logical = FALSE)
data <- split(selected.edges,factor(sort(rank(selected.edges)%%no_cores)))
c_result <- mclapply(1:no_cores, function(x) {
E(gr)[unlist(data[[x]])]$weight * 1000 / mean_value }, mc.cores = no_cores)
E(gr)[unlist(data)]$weight<-unlist(c_result)
I wonder why do I have to do the "writing step" outside of parallel loop. As I was trying to write the weight back to igraph within the loop it did not work ie weight did not get updated.
Thank you in advance! BR
As demonstrated in Advanced R, implementation performance in R can greatly vary depending on the syntax.
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
is a valid syntax, but it can also be formulated otherwise:
set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))
So let's compare both solutions :
microbenchmark::microbenchmark(
ref={E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10},
new={set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))})
Unit: microseconds
expr min lq mean median uq max neval cld
ref 15920.404 16567.788 17793.4412 17111.583 18491.685 25867.477 100 b
new 246.974 266.462 296.5088 278.769 292.718 662.974 100 a
@Andreas, can you please check on a bigger dataset if this could be a solution to your problem?