r, igraph, network-analysis

Take igraph output and match it into a new dataset by vertex and dataset name


I have what is, to me, a rather complicated piece of code to write, and I'd really appreciate some help figuring out the various pieces. Essentially, I have 200 different dataframes that will be turned into igraph objects and used for network analysis. The vertices in these networks are people, and a person can appear in more than one network, so I have about 500 network/person pairs. I've created some sample data below with three networks and six different people.

library(igraph)

mat1<-matrix(data=c(0,4,3,2,0,1,3,4,0),nrow=3, ncol=3)
colnames(mat1)<-c("Haley.A", "Carl.F", "Chris.P")
mat1_graph<-graph_from_adjacency_matrix(mat1,
                                    weighted=TRUE, mode="directed", diag=FALSE)

mat2<-matrix(data=c(0,3,2,0),nrow=2, ncol=2)
colnames(mat2)<-c("Tom.D", "Chris.P")
mat2_graph<-graph_from_adjacency_matrix(mat2,
                                        weighted=TRUE, mode="directed", diag=FALSE)

mat3<-matrix(data=c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0),nrow=4, ncol=4)
colnames(mat3)<-c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")
mat3_graph<-graph_from_adjacency_matrix(mat3,
                                        weighted=TRUE, mode="directed", diag=FALSE)

I have a data frame that I've created with every pairing of network name and person. Note that in my actual data, the network names do NOT follow a naming pattern; they are very different names (unlike mat1, mat2, etc. in the example data).

So, in the case of the example data, the dataframe I've generated would look like:

all_pairs<-matrix(data=c("mat1", "Haley.A", "mat1","Carl.F", "mat1","Chris.P", "mat2", "Tom.D", "mat2","Chris.P", "mat3", "Tom.D", "mat3","Carl.F", "mat3","Charles.L", "mat3","Peter.Q"),nrow=9, ncol=2, byrow=TRUE)
colnames(all_pairs)<-c("network_name", "person_name")
all_pairs<-as.data.frame(all_pairs)

print(all_pairs)
  network_name person_name
1         mat1     Haley.A
2         mat1      Carl.F
3         mat1     Chris.P
4         mat2       Tom.D
5         mat2     Chris.P
6         mat3       Tom.D
7         mat3      Carl.F
8         mat3   Charles.L
9         mat3     Peter.Q

What I would like is to run igraph's degree function on each network and then put the output into the appropriate row for each person and network name pairing.

For example, if I ran igraph::degree(mat1_graph, mode="all") and received this output

Haley.A  Carl.F Chris.P 
     4       4       4 

then I could create a column in the all_pairs dataframe and enter the information to look like this:

  network_name person_name degree_all
1         mat1     Haley.A          4
2         mat1      Carl.F          4
3         mat1     Chris.P          4
4         mat2       Tom.D         NA
5         mat2     Chris.P         NA
6         mat3       Tom.D         NA
7         mat3      Carl.F         NA
8         mat3   Charles.L         NA
9         mat3     Peter.Q         NA

This would continue for each dataframe and network name/person pair. I'm struggling with how to set up code that stores these values from the igraph package in a dataframe based on the vertex name and dataframe name. I'll then need to do this for every dataset, which I was thinking of using lapply for, but I'm open to better ideas as well.

Would greatly appreciate any advice and ideas about this puzzle! Thank you in advance.


Solution

  • Use merge to join the two dataframes on their shared key columns (network_name and person_name).

    library(igraph)
    
    ## creating adjacency matrices, with people as column names
    mat1<-matrix(data=c(0,4,3,2,0, 1, 3, 4 ,0),nrow=3, ncol=3)
    colnames(mat1)<-c("Haley.A", "Carl.F", "Chris.P")
    mat2<-matrix(data=c(0,3,2,0),nrow=2, ncol=2)
    colnames(mat2)<-c("Tom.D", "Chris.P")
    mat3<-matrix(data=c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0),nrow=4, ncol=4)
    colnames(mat3)<-c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")
    
    ## save in list
    mats <- list(mat1=mat1, mat2=mat2, mat3=mat3)
    
    ## creating all pairs indexed by network, person
    all_pairs<-matrix(data=c("mat1", "Haley.A", "mat1","Carl.F", "mat1","Chris.P",
                             "mat2", "Tom.D", "mat2","Chris.P",
                             "mat3", "Tom.D", "mat3","Carl.F", "mat3","Charles.L", "mat3","Peter.Q"), ncol=2, byrow=TRUE)
    colnames(all_pairs)<-c("network_name", "person_name")
    
    ## convert matrices to graphs
    ## calculate degrees and collect
    all_dfs <- data.frame(matrix(nrow=0, ncol=3))
    processed <- c(1,3)     ## only mat1 and mat3 are processed here, so mat2 stays NA;
                            ## use seq_along(mats) to process every network
    for (i in processed) {
      ggg     <- graph_from_adjacency_matrix(mats[[i]], weighted=TRUE, mode="directed", diag=FALSE)
      dd1     <- degree(ggg, mode="all")
      ## build the data frame column by column so degree stays numeric
      ## (cbind() would coerce every column to character)
      dd2     <- data.frame(network_name=names(mats)[i], person_name=names(dd1),
                            degree=unname(dd1), stringsAsFactors=FALSE)
    
      # all_dfs <- rbind(all_dfs, dd2)                      ## slow
      all_dfs[nrow(all_dfs) + seq(nrow(dd2)), ] <- dd2      ## faster
    }
    colnames(all_dfs) <- c(colnames(all_pairs), "degree")
     
    ## comparing all_pairs with all processed dataframes
    comp <- merge(x = all_pairs, y = all_dfs, by= c("network_name", "person_name"), all = TRUE)
    comp
    

    This gives:

      network_name person_name degree
    1         mat1      Carl.F      4
    2         mat1     Chris.P      4
    3         mat1     Haley.A      4
    4         mat2     Chris.P     NA
    5         mat2       Tom.D     NA
    6         mat3      Carl.F      6
    7         mat3   Charles.L      6
    8         mat3     Peter.Q      6
    9         mat3       Tom.D      6
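
Since the question mentions lapply, the same idea can also be sketched without an explicit loop, processing every network in the list. This is only one possible arrangement, not the only way to do it: degree_table is a hypothetical helper name introduced here, the degree_all column name is taken from the question, and all.x=TRUE is an assumed choice that keeps exactly the rows of all_pairs.

```r
library(igraph)

## Sample adjacency matrices from the question, stored in a named list
mat1 <- matrix(c(0,4,3,2,0,1,3,4,0), nrow=3, ncol=3)
colnames(mat1) <- c("Haley.A", "Carl.F", "Chris.P")
mat2 <- matrix(c(0,3,2,0), nrow=2, ncol=2)
colnames(mat2) <- c("Tom.D", "Chris.P")
mat3 <- matrix(c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0), nrow=4, ncol=4)
colnames(mat3) <- c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")
mats <- list(mat1=mat1, mat2=mat2, mat3=mat3)

## Hypothetical helper (not part of igraph): builds one long data frame
## with a row per network/person pair and that person's degree
degree_table <- function(mats, mode="all") {
  rows <- lapply(names(mats), function(nm) {
    g <- graph_from_adjacency_matrix(mats[[nm]], weighted=TRUE,
                                     mode="directed", diag=FALSE)
    d <- degree(g, mode=mode)
    data.frame(network_name=nm, person_name=names(d),
               degree_all=unname(d), stringsAsFactors=FALSE)
  })
  do.call(rbind, rows)   # bind once at the end instead of growing in a loop
}

all_dfs <- degree_table(mats)

## The network/person pairs from the question
all_pairs <- data.frame(
  network_name=rep(c("mat1", "mat2", "mat3"), times=c(3, 2, 4)),
  person_name=c("Haley.A", "Carl.F", "Chris.P", "Tom.D", "Chris.P",
                "Tom.D", "Carl.F", "Charles.L", "Peter.Q"),
  stringsAsFactors=FALSE)

## Left join: keep every row of all_pairs and attach the matching degree
comp <- merge(all_pairs, all_dfs,
              by=c("network_name", "person_name"), all.x=TRUE)
comp
```

If the matrices already exist as separately named objects rather than in a list, base R's mget(unique(all_pairs$network_name)) may be one way to collect them into a named list first, which would also cover network names that follow no pattern.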