rigraph

Calculating the shortest distance between multiple points on a graph?


I have a random graph in R:

library(igraph)
library(tidyr)

set.seed(123)
g <- sample_gnm(n = 20, m = 20, directed = FALSE)

enter image description here

I am trying to find out the shortest distance between each node to each node.

I found this function in igraph called distances() and tried to implement this:

dist_df <- as.data.frame(distances(g)) %>%
    mutate(node_i = 1:nrow(.)) %>%
    pivot_longer(cols = -node_i,
                 names_to = "node_j",
                 values_to = "distance") %>%
    mutate(node_j = as.numeric(gsub("V", "", node_j))) %>%
    select(node_i, node_j, distance)

Manually checking the results, they seem to be correct:

# A tibble: 400 × 3
   node_i node_j distance
    <int>  <dbl>    <dbl>
 1      1      1        0
 2      1      2        2
 3      1      3        3
 4      1      4        4
 5      1      5        4
 6      1      6        3
 7      1      7      Inf
 8      1      8        3
 9      1      9      Inf
10      1     10      Inf

Here is my question: How is the distance() function able to work so fast? I thought finding out the shortest path in graphs is a very computationally intensive process - how is this function doing it so quickly? Is this the correct function for this problem?


Solution

  • If you are curious about "Why"s, I suggest you refer to the source code by typing ?distances (I am not an expert on graph theory, but hope the code below could give you some clues)

    > distances
    function (graph, v = V(graph), to = V(graph), mode = c("all", 
        "out", "in"), weights = NULL, algorithm = c("automatic", 
        "unweighted", "dijkstra", "bellman-ford", "johnson", "floyd-warshall")) 
    {
        ensure_igraph(graph)
        if (!is_directed(graph)) {
            mode <- "out"
        }
        v <- as_igraph_vs(graph, v)
        to <- as_igraph_vs(graph, to)
        mode <- igraph.match.arg(mode)
        mode <- switch(mode, out = 1, `in` = 2, all = 3)
        algorithm <- igraph.match.arg(algorithm)
        algorithm <- switch(algorithm, automatic = 0, unweighted = 1,
            dijkstra = 2, `bellman-ford` = 3, johnson = 4, `floyd-warshall` = 5)
        if (is.null(weights)) {
            if ("weight" %in% edge_attr_names(graph)) {
                weights <- as.numeric(E(graph)$weight)
            }
        }
        else {
            if (length(weights) == 1 && is.na(weights)) {
                weights <- NULL
            }
            else {
                weights <- as.numeric(weights)
            }
        }
        if (!is.null(weights) && algorithm == 1) {
            weights <- NULL
            warning("Unweighted algorithm chosen, weights ignored")
        }
        on.exit(.Call(R_igraph_finalizer))
        res <- .Call(R_igraph_shortest_paths, graph, v - 1, to -
            1, as.numeric(mode), weights, as.numeric(algorithm))
        if (igraph_opt("add.vertex.names") && is_named(graph)) {
            rownames(res) <- V(graph)$name[v]
            colnames(res) <- V(graph)$name[to]
        }
        res
    }
    

    where you can see there are several algorithm options, and their source code can be found https://github.com/igraph/igraph/tree/master/src/paths

    Maybe you should read about those files and and I believe you would conclude it by yourself.