I have a list of multiple tidygraph objects, and I am trying to return the index of a specific tidygraph object selected by the user. Hopefully my example below will explain the problem.
(ASIDE: I have attempted a solution, shown below, but at the moment it is very slow to run. I'm hoping to come up with a different, faster solution.)
To begin, I create some data to turn into tidygraph objects, then I create the tidygraph objects and put them all together into a list:
library(tidygraph)
# create some data for the tbl_graph
nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                    level = c(1, 1, 1, 1),
                    rank = c(1, 1, 1, 1))
nodes1 <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                     level = c(1, 1, 1, 1),
                     rank = c(2, 2, 2, 2))
nodes2 <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                     level = c(1, 1, 1, 1),
                     rank = c(3, 3, 3, 3))
nodes3 <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                     level = c(2, 2, 2, 2),
                     rank = c(1, 1, 1, 1))
edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                    to = c(2, 3, 4, 1, 1, 2, 1, 2, 3))
# create the tbl_graphs
tg <- tbl_graph(nodes = nodes, edges = edges)
tg_1 <- tbl_graph(nodes = nodes1, edges = edges)
tg_2 <- tbl_graph(nodes = nodes2, edges = edges)
tg_3 <- tbl_graph(nodes = nodes3, edges = edges)
# put into list
myList <- list(tg, tg_1, tg_2, tg_3)
For clarity, the first list element looks like this:
> myList[1]
[[1]]
# A tbl_graph: 4 nodes and 9 edges
#
# A directed simple graph with 1 component
#
# Node Data: 4 × 3 (active)
name level rank
<chr> <dbl> <dbl>
1 Hadley 1 1
2 David 1 1
3 Romain 1 1
4 Julia 1 1
#
# Edge Data: 9 × 2
from to
<int> <int>
1 1 2
2 1 3
3 1 4
# … with 6 more rows
We can see that each object has a variable called level and another called rank. What I'm trying to do is return the list index of an object by selecting the level and rank number. So, for example, if I select level = 1 and rank = 2, my function would return the index of the object with those values (in this case the 2nd list element). My attempted solution is below, but it's a very slow process. Is there a better way to achieve what I want?
My Attempted Solution
In my solution, I begin by turning each of the tidygraph objects into tibbles to make them easier to manipulate, and this is why my function is so slow. In my data, I could have up to 200,000 tidygraph objects in a list, so going through them and converting them all to tibbles is a very slow process. I do that like so:
# separating out the list to make it easier to manipulate
list_obj <- lapply(myList, function(x) {
  edges <- tidygraph::activate(x, edges) %>% tibble::as_tibble()
  nodes <- tidygraph::activate(x, nodes) %>% tibble::as_tibble()
  return(list(edges = edges, nodes = nodes))
})
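To check that the conversion step really is where the time goes, something like the sketch below could be used. It is only an illustration: testList is a hypothetical list of 200 identical graphs built for timing purposes (not my real data), and the names conversionTime and converted are made up for this example.

```r
library(tidygraph)

# Hypothetical test data (assumption): 200 copies of one small graph
nodes <- data.frame(name = c("Hadley", "David", "Romain", "Julia"),
                    level = c(1, 1, 1, 1),
                    rank = c(1, 1, 1, 1))
edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                    to = c(2, 3, 4, 1, 1, 2, 1, 2, 3))
testList <- replicate(200, tbl_graph(nodes = nodes, edges = edges),
                      simplify = FALSE)

# Time the tibble conversion of every graph in the list
conversionTime <- system.time(
  converted <- lapply(testList, function(x) {
    list(edges = tibble::as_tibble(activate(x, edges)),
         nodes = tibble::as_tibble(activate(x, nodes)))
  })
)

# Elapsed seconds for the whole conversion (varies by machine)
conversionTime[["elapsed"]]
```

Scaling the length of testList up gives a rough idea of how the cost grows towards 200,000 objects.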
And then this is the function I actually use to extract the index of the chosen object:
# this function returns the tree index asked for by user
getTreeListNumber <- function(listObj, level, rank) {
  listNumber <- NA
  # seq_along() is safe even if listObj is empty, unlike 1:length(listObj)
  for (i in seq_along(listObj)) {
    res <- level %in% listObj[[i]]$nodes$level && rank %in% listObj[[i]]$nodes$rank
    if (res) {
      listNumber <- i
    }
  }
  return(listNumber)
}
For example:
> getTreeListNumber(list_obj, level = 1, rank = 2)
[1] 2
By selecting the level and rank, the function returns the object's index within the list. But is there a faster way to achieve this result?
You may try:
getTreeListNumber <- function(listObj, level, rank) {
  # use the listObj argument (not the global myList) so the function works on any list
  which(sapply(listObj, function(x) {
    nodes <- tidygraph::activate(x, nodes) %>% tibble::as_tibble()
    all(nodes$level == level & nodes$rank == rank)
  }))
}
getTreeListNumber(myList, 1, 2)
#[1] 2
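If level and rank are constant across all nodes of each graph, as they are in the example data, another option is to skip the tibble conversion entirely, read one value per graph from the vertex attributes, and build a small lookup table once. The sketch below assumes exactly that. A tbl_graph is also an igraph object, so igraph::vertex_attr() can be called on it directly. The names makeNodes, buildTreeIndex and findTree are hypothetical helpers made up for this illustration, and I have not benchmarked this on a list of 200,000 graphs.

```r
library(tidygraph)

# Hypothetical helper (assumption): rebuilds the example node data
makeNodes <- function(level, rank) {
  data.frame(name = c("Hadley", "David", "Romain", "Julia"),
             level = rep(level, 4),
             rank = rep(rank, 4))
}
edges <- data.frame(from = c(1, 1, 1, 2, 3, 3, 4, 4, 4),
                    to = c(2, 3, 4, 1, 1, 2, 1, 2, 3))
myList <- list(
  tbl_graph(nodes = makeNodes(1, 1), edges = edges),
  tbl_graph(nodes = makeNodes(1, 2), edges = edges),
  tbl_graph(nodes = makeNodes(1, 3), edges = edges),
  tbl_graph(nodes = makeNodes(2, 1), edges = edges)
)

# Build the lookup table once: one row per graph, taking the first node's
# level and rank (assumes both are constant within a graph)
buildTreeIndex <- function(graphList) {
  data.frame(
    level = vapply(graphList,
                   function(g) igraph::vertex_attr(g, "level")[1],
                   numeric(1)),
    rank = vapply(graphList,
                  function(g) igraph::vertex_attr(g, "rank")[1],
                  numeric(1))
  )
}

# Each query is then a vectorised comparison on two columns
findTree <- function(index, level, rank) {
  which(index$level == level & index$rank == rank)
}

treeIndex <- buildTreeIndex(myList)
findTree(treeIndex, level = 1, rank = 2)
#[1] 2
```

The table is built with a single pass over the list, so repeated lookups do not touch the graphs again. If the values can differ between nodes of the same graph, this shortcut does not apply and the all() check above is the safer choice.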