r, sna, ggraph, tidygraph

Eigenvector centrality by group using tidygraph in R


I'm working with network data and tidygraph in R. I'm trying to get yearly centrality measures without having to filter and merge, or iterate. Basically, I want a node-year dataset that contains each node's eigenvector and degree centrality.

tidygraph should be able to do this, but I cannot find the right way...

Below is some code that loads the packages and reproduces my dataset.

I'm working with R version 4.2.2 and tidygraph 1.2.3.

Thanks!!

Livio

library(tidyverse)
library(manynet)
library(ggraph)
library(sna)
library(igraph)
library(tidygraph)

nodes <- tibble(
  node_id = 1:6,
  transaction_type = c("Purchase", "Deposit", "Withdrawal", "Transfer", "Payment", "Loan")
)

edges <- tibble(
  from = c(1, 2, 3, 2, 4, 4, 5, 6, 5),
  to = c(2, 4, 2, 3, 5, 6, 6, 4, 1),
  year = c(2019, 2020, 2019, 2018, 2021, 2020, 2022, 2021, 2018),
  amount = c(100, 500, 50, 200, 300, 150, 75, 1000, 250)
)

net <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

net %>% #one of my attempts...
  as_tbl_graph()%>%
  activate(nodes)%>%
  group_by(.E()$year)%>%
  mutate(eigen_centr=centrality_eigen(.E()))
as.tibble()

Solution

  • net is a single graph covering all years, whereas you want to measure the centrality of each node by year. As far as I know, there are only two ways to do this:

    1. Construct a separate graph for each year.
    2. Calculate the centrality of each node on the full graph, using edge weights that select a single year.

    Although the second option sounds closer to what you are trying to do, it's not that simple: it requires adding a new column to the node data for each year. The first option seems easier to me, and doesn't require explicit loops:

    cbind(nodes,
      lapply(split(edges, edges$year), graph_from_data_frame, vertices = nodes) |>
      lapply(\(x) x %>% 
               as_tbl_graph() %>%
               mutate(centrality = round(centrality_eigen(), 3)) %>%
               as_tibble() %>%
               pluck('centrality')) %>%
      as.data.frame(check.names = FALSE)
    )
    #>   node_id transaction_type 2018  2019  2020  2021 2022
    #> 1       1         Purchase    1 0.707 0.000 0.000    0
    #> 2       2          Deposit    1 1.000 0.707 0.000    0
    #> 3       3       Withdrawal    1 0.707 0.000 0.000    0
    #> 4       4         Transfer    0 0.000 1.000 1.000    0
    #> 5       5          Payment    1 0.000 0.000 0.707    1
    #> 6       6             Loan    0 0.000 0.707 0.707    1
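
    Since the question asks for a node-year dataset with both eigenvector and degree centrality, the same one-graph-per-year idea can also be stacked into long format rather than bound as wide columns. The following is a minimal sketch under some assumptions: the column names eigen and degree are placeholders of my own, and degree is counted over incoming and outgoing edges (mode = "all"), which may not be the variant you want.

    ```r
    library(dplyr)
    library(igraph)
    library(tidygraph)

    nodes <- tibble(
      node_id = 1:6,
      transaction_type = c("Purchase", "Deposit", "Withdrawal", "Transfer", "Payment", "Loan")
    )

    edges <- tibble(
      from = c(1, 2, 3, 2, 4, 4, 5, 6, 5),
      to = c(2, 4, 2, 3, 5, 6, 6, 4, 1),
      year = c(2019, 2020, 2019, 2018, 2021, 2020, 2022, 2021, 2018),
      amount = c(100, 500, 50, 200, 300, 150, 75, 1000, 250)
    )

    # One graph per year; passing `vertices = nodes` keeps all six nodes in
    # every yearly graph, including nodes with no edges that year.
    graphs_by_year <- lapply(split(edges, edges$year),
                             graph_from_data_frame,
                             vertices = nodes, directed = TRUE)

    # Compute both measures on each yearly graph and extract the node table.
    node_tables <- lapply(graphs_by_year, \(g) g %>%
                            as_tbl_graph() %>%
                            mutate(eigen = centrality_eigen(),
                                   degree = centrality_degree(mode = "all")) %>%
                            as_tibble())

    # Stack the yearly tables; `.id` stores the list names (the years, as
    # character strings) in a `year` column, giving one row per node-year.
    node_year <- bind_rows(node_tables, .id = "year")
    ```

    The result has one row per node and year, with the node identifier in the name column (as character, because graph_from_data_frame() turns the first column of the vertex data into vertex names).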
    

    The second version could be something like this, but it requires naming each year explicitly, which seems more laborious and error-prone.

    net %>% 
      as_tbl_graph() %>%
      mutate(`2018` = round(centrality_eigen(weights = year == 2018), 3),
             `2019` = round(centrality_eigen(weights = year == 2019), 3),
             `2020` = round(centrality_eigen(weights = year == 2020), 3),
             `2021` = round(centrality_eigen(weights = year == 2021), 3),
             `2022` = round(centrality_eigen(weights = year == 2022), 3))
    #> # A tbl_graph: 6 nodes and 9 edges
    #> #
    #> # A directed simple graph with 1 component
    #> #
    #> # A tibble: 6 x 7
    #>   name  transaction_type `2018` `2019` `2020` `2021` `2022`
    #>   <chr> <chr>             <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
    #> 1 1     Purchase              1  0.707  0      0          0
    #> 2 2     Deposit               1  1      0.707  0          0
    #> 3 3     Withdrawal            1  0.707  0      0          0
    #> 4 4     Transfer              0  0      1      1          0
    #> 5 5     Payment               1  0      0      0.707      1
    #> 6 6     Loan                  0  0      0.707  0.707      1
    #> #
    #> # A tibble: 9 x 4
    #>    from    to  year amount
    #>   <int> <int> <dbl>  <dbl>
    #> 1     1     2  2019    100
    #> 2     2     4  2020    500
    #> 3     3     2  2019     50
    #> # i 6 more rows
    #> # i Use `print(n = ...)` to see more rows
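
    If naming each year by hand is the main drawback, the per-year expressions could in principle be generated from the data and spliced into mutate(). This is only a sketch: year_calls and by_year are names I made up, it relies on rlang's `!!` and `!!!` operators being passed through tidygraph's mutate() to dplyr, and it assumes the year == ... weights trick behaves on your igraph version the way it does in the output above.

    ```r
    library(dplyr)
    library(igraph)
    library(tidygraph)

    nodes <- tibble(
      node_id = 1:6,
      transaction_type = c("Purchase", "Deposit", "Withdrawal", "Transfer", "Payment", "Loan")
    )

    edges <- tibble(
      from = c(1, 2, 3, 2, 4, 4, 5, 6, 5),
      to = c(2, 4, 2, 3, 5, 6, 6, 4, 1),
      year = c(2019, 2020, 2019, 2018, 2021, 2020, 2022, 2021, 2018),
      amount = c(100, 500, 50, 200, 300, 150, 75, 1000, 250)
    )

    net <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

    years <- sort(unique(edges$year))

    # Build one unevaluated call per year. `!!y` inlines the year value into
    # the expression; the list names ("2018", "2019", ...) become column names.
    year_calls <- lapply(
      setNames(years, years),
      \(y) rlang::expr(round(centrality_eigen(weights = year == !!y), 3))
    )

    # Splice the named calls into a single mutate() on the full graph.
    by_year <- net %>%
      as_tbl_graph() %>%
      mutate(!!!year_calls)
    ```

    This keeps everything in one graph, but it is arguably harder to read than the split-by-year version, so I would still lean towards the first option.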