r tidygraph

How to reorder a list of tidygraph objects based on a column in the list in R?


I have a list of tidygraph objects that I am trying to reorder according to a specific criterion. Each element of the list has a column called name. I want to group together the list elements whose name columns are identical, and then order those groups by size in descending order (that is, by how many list elements share the same name column). Hopefully my example explains this more clearly.

To begin, I create some data, turn them into tidygraph objects and put them together in a list:

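library(tidygraph)
library(dplyr)  # group_by(), mutate(), arrange(), desc(), n()
library(purrr)  # map()
library(tidyr)  # replace_na()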

# create some node and edge data for the tbl_graph
nodes1 <- data.frame(
  name = c("x4", NA, NA),
  val = c(1, 5, 2)
)
nodes2 <- data.frame(
  name = c("x4", "x2", NA, NA, "x1", NA, NA),
  val = c(3, 2, 2, 1, 1, 2, 7)
)
nodes3 <- data.frame(
  name = c("x1", "x2", NA),
  val = c(7, 4, 2)
)
nodes4 <- nodes1
nodes5 <- nodes2
nodes6 <- nodes1

edges <- data.frame(from = c(1, 1), to = c(2, 3))
edges1 <- data.frame(
  from = c(1, 2, 2, 1, 5, 5),
  to = c(2, 3, 4, 5, 6, 7)
)

# create the tbl_graphs
tg_1 <- tbl_graph(nodes = nodes1, edges = edges)
tg_2 <- tbl_graph(nodes = nodes2, edges = edges1)
tg_3 <- tbl_graph(nodes = nodes3, edges = edges)
tg_4 <- tbl_graph(nodes = nodes4, edges = edges)
tg_5 <- tbl_graph(nodes = nodes5, edges = edges1)
tg_6 <- tbl_graph(nodes = nodes6, edges = edges)


# put into list
myList <- list(tg_1, tg_2, tg_3, tg_4, tg_5, tg_6) 

So, we can see that there are 6 tidygraph objects in myList.

Examining each element, we can see that 3 objects have identical name columns (x4, NA, NA), 2 objects have identical name columns (x4, x2, NA, NA, x1, NA, NA), and 1 object remains (x1, x2, NA).

Using a little function to get the counts of equal name columns:

# get a count of identical list elements based on `name` col
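counts <- lapply(myList, function(x) {
  x %>%
    pull(name) %>%
    paste(collapse = " ")   # one key per graph, e.g. "x4 NA NA"
}) %>%
  unlist(use.names = FALSE) %>%
  as_tibble() %>%
  group_by(value) %>%
  mutate(val = n():1) %>%   # counts down from the group size
  slice(1) %>%              # first row per group holds the group size
  arrange(-val)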

Just for clarity:

> counts
# A tibble: 3 × 2
# Groups:   value [3]
  value                  val
  <chr>                <int>
1 x4 NA NA                 3
2 x4 x2 NA NA x1 NA NA     2
3 x1 x2 NA                 1
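
The same table can probably be produced more directly with dplyr's count(). This is only a sketch: it assumes dplyr is installed, and the keys vector below is typed by hand from the example rather than pulled from the graphs.

```r
library(dplyr)

# One key per graph, copied by hand from the example above (an assumption
# made so that the snippet runs without building the graphs first)
keys <- c("x4 NA NA", "x4 x2 NA NA x1 NA NA", "x1 x2 NA",
          "x4 NA NA", "x4 x2 NA NA x1 NA NA", "x4 NA NA")

# count() tallies identical keys; sort = TRUE puts the largest group first
counts <- tibble(value = keys) %>%
  count(value, name = "val", sort = TRUE)
```

Unlike the version above, the result is not grouped, so no ungroup() is needed before further steps.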

I would like to rearrange the order of list elements in myList based on the val column in my counts object.

My desired output would look something like this (which I am just manually reordering):

myList <- list(tg_1, tg_4, tg_6, tg_2, tg_5, tg_3)

Is there a way to automate the reordering of my list based on the count of identical name columns?

UPDATE:

So my attempted solution is to do the following:

ind <- map(myList, function(x){
  x %>%
    pull(name) %>%
    replace_na("..") %>%
    paste0(collapse = "")
}) %>%
  unlist(use.names = FALSE) %>%
  as_tibble() %>%
  mutate(ids = 1:n()) %>%
  group_by(value) %>%
  mutate(val = n():1) %>% 
  arrange(value) %>% 
  pull(ids)

# return new list of trees
myListNew <- myList[ind]

The code above builds one key per graph from its name column, sorts by that key so that identical graphs sit together, and returns their original positions as ind. Indexing the original list with ind then rearranges it.

However, I would still like to sort the groups by size, largest first, and I haven't figured that out yet.


Solution

  • After hours of testing, I eventually arrived at a working solution.

    ind <- map(myList, function(x){
      x %>%
        pull(name) %>%
        replace_na("..") %>%        # make missing names visible in the key
        paste0(collapse = "")
    }) %>%
      unlist(use.names = FALSE) %>%
      as_tibble() %>%
      mutate(ids = 1:n()) %>%       # original position in myList
      group_by(value) %>%
      mutate(val = n():1) %>%       # counts down from the group size
      arrange(value)
    
    ind <- ind %>%
      group_by(value) %>%
      mutate(valrank = min(ids)) %>%   # first position at which the group appears
      ungroup() %>%
      arrange(valrank, value, desc(val)) %>%
      pull(ids)
    
    # return new list of trees
    myListNew <- myList[ind]
    

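    The code above first builds one key per graph from its name column and sorts the rows by that key. It then groups by the key and adds a valrank column holding the position at which each group first appears in myList. Sorting by valrank (and, within a group, by descending val) keeps identical graphs together, and indexing myList by the resulting ids gives the new order. Note that the groups end up ordered by first appearance rather than by size; for this example the two orders coincide, since the largest group appears first and the smallest last.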
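
To order the groups strictly by size, one option is to compute each element's group size and sort on it. Below is a minimal base-R sketch of that idea, not a definitive implementation. order_by_group_size is a hypothetical helper name, and name_cols is typed by hand from the example so that the snippet runs without tidygraph.

```r
# Hypothetical helper (not part of tidygraph): returns an index that puts the
# largest groups of identical keys first.
order_by_group_size <- function(keys) {
  # Size of the group each element belongs to
  group_size <- ave(seq_along(keys), keys, FUN = length)
  # Position at which each element's group first appears (tie-breaker)
  first_seen <- match(keys, keys)
  # Largest groups first, then first appearance, then original position
  order(-group_size, first_seen, seq_along(keys))
}

# The six `name` columns from the example, copied by hand (an assumption)
name_cols <- list(
  c("x4", NA, NA),
  c("x4", "x2", NA, NA, "x1", NA, NA),
  c("x1", "x2", NA),
  c("x4", NA, NA),
  c("x4", "x2", NA, NA, "x1", NA, NA),
  c("x4", NA, NA)
)

# Collapse each column into a single key, e.g. "x4 NA NA"
keys <- vapply(name_cols, paste, character(1), collapse = " ")

ind <- order_by_group_size(keys)
```

With the actual list of graphs, the keys could instead be built with vapply(myList, function(g) paste(pull(g, name), collapse = " "), character(1)), and the list reordered with myList[ind]. Ties between groups of equal size fall back to first appearance, which is a design choice rather than a requirement.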