I have what is, to me, a rather complicated piece of code to write, and I'd really appreciate some help figuring out the various pieces. Essentially, I have 200 different data frames that will be turned into igraph objects and used for network analysis. The vertices in these networks are people, and a person can appear in more than one network, so I have about 500 network/person pairs. Below is some sample data with three networks and six different people.
library(igraph)
mat1<-matrix(data=c(0,4,3,2,0, 1, 3, 4 ,0),nrow=3, ncol=3)
colnames(mat1)<-c("Haley.A", "Carl.F", "Chris.P")
mat1_graph<-graph_from_adjacency_matrix(mat1,
weighted=TRUE, mode="directed", diag=FALSE)
mat2<-matrix(data=c(0,3,2,0),nrow=2, ncol=2)
colnames(mat2)<-c("Tom.D", "Chris.P")
mat2_graph<-graph_from_adjacency_matrix(mat2,
weighted=TRUE, mode="directed", diag=FALSE)
mat3<-matrix(data=c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0),nrow=4, ncol=4)
colnames(mat3)<-c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")
mat3_graph<-graph_from_adjacency_matrix(mat3,
weighted=TRUE, mode="directed", diag=FALSE)
I have created a data frame containing every pairing of network name and person. Note that in my actual data the network names do NOT follow a naming pattern; they are very different names (unlike mat1, mat2, etc. in the example data).
So, in the case of the example data, the dataframe I've generated would look like:
all_pairs<-matrix(data=c("mat1", "Haley.A", "mat1","Carl.F", "mat1","Chris.P", "mat2", "Tom.D", "mat2","Chris.P", "mat3", "Tom.D", "mat3","Carl.F", "mat3","Charles.L", "mat3","Peter.Q"),nrow=9, ncol=2, byrow=TRUE)
colnames(all_pairs)<-c("network_name", "person_name")
all_pairs<-as.data.frame(all_pairs)
print(all_pairs)
network_name person_name
1 mat1 Haley.A
2 mat1 Carl.F
3 mat1 Chris.P
4 mat2 Tom.D
5 mat2 Chris.P
6 mat3 Tom.D
7 mat3 Carl.F
8 mat3 Charles.L
9 mat3 Peter.Q
What I would like is to run igraph's degree function on each network and then put the output into the row that matches the person and network name.
So, for example, if I ran igraph::degree(mat1_graph, mode=c("all"))
and received this output
Haley.A Carl.F Chris.P
4 4 4
then I could create a column in the all_pairs dataframe and enter the information to look like this:
network_name person_name degree_all
1 mat1 Haley.A 4
2 mat1 Carl.F 4
3 mat1 Chris.P 4
4 mat2 Tom.D NA
5 mat2 Chris.P NA
6 mat3 Tom.D NA
7 mat3 Carl.F NA
8 mat3 Charles.L NA
9 mat3 Peter.Q NA
This would continue for each data frame and network name/person pair. I'm struggling with how to set up code that stores these values from the igraph package in a data frame, keyed on the vertex name and the network name. I'll then need to do this for every dataset. I was thinking of using lapply for that, but I'm open to better ideas as well.
Would greatly appreciate any advice and ideas about this puzzle! Thank you in advance.
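Collect the degrees for every processed network into one long data frame (network name, person name, degree), then use merge to join it to all_pairs on the network and person columns.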
library(igraph)
## creating adjacency matrices, indexed by person x person
mat1<-matrix(data=c(0,4,3,2,0, 1, 3, 4 ,0),nrow=3, ncol=3)
colnames(mat1)<-c("Haley.A", "Carl.F", "Chris.P")
mat2<-matrix(data=c(0,3,2,0),nrow=2, ncol=2)
colnames(mat2)<-c("Tom.D", "Chris.P")
mat3<-matrix(data=c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0),nrow=4, ncol=4)
colnames(mat3)<-c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")
## save in list
mats <- list(mat1=mat1, mat2=mat2, mat3=mat3)
## creating all pairs indexed by network, person
all_pairs<-matrix(data=c("mat1", "Haley.A", "mat1","Carl.F", "mat1","Chris.P",
"mat2", "Tom.D", "mat2","Chris.P",
"mat3", "Tom.D", "mat3","Carl.F", "mat3","Charles.L", "mat3","Peter.Q"), ncol=2, byrow=TRUE)
colnames(all_pairs)<-c("network_name", "person_name")
## convert each matrix to a graph, calculate degrees, and collect the results
all_dfs <- data.frame(matrix(nrow=0, ncol=3))
## only networks 1 and 3 are processed here, so mat2 stays NA after the merge
processed <- c(1,3)
for (i in processed) {
  df <- mats[i]  ## one-element list; names(df) is the network name
  ggg <- graph_from_adjacency_matrix(df[[1]], weighted=TRUE, mode="directed", diag=FALSE)
  dd1 <- degree(ggg, mode=c("all"))
  ## cbind() builds a character matrix, so the degree column is stored as text
  dd2 <- data.frame(cbind(names(df), names(dd1), dd1), row.names=NULL)
  # all_dfs <- rbind(all_dfs, dd2) ## slow
  all_dfs[nrow(all_dfs) + seq(nrow(dd2)), ] <- dd2 ## faster
}
colnames(all_dfs) <- c(colnames(all_pairs), "degree")
## comparing all_pairs with all processed dataframes
comp <- merge(x = all_pairs, y = all_dfs, by= c("network_name", "person_name"), all = TRUE)
comp
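This gives the following. mat2 was not in processed, so its degrees are missing, and the degree column is text rather than numeric, which is why the missing values print as <NA>: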
network_name person_name degree
1 mat1 Carl.F 4
2 mat1 Chris.P 4
3 mat1 Haley.A 4
4 mat2 Chris.P <NA>
5 mat2 Tom.D <NA>
6 mat3 Carl.F 6
7 mat3 Charles.L 6
8 mat3 Peter.Q 6
9 mat3 Tom.D 6
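Since the question mentions lapply, here is a minimal sketch of the same idea without the explicit loop, assuming all the adjacency matrices are kept in a named list whose names are the network names. The helper degree_table and the column name degree_all are only illustrative names, not part of igraph. Building each piece with data.frame() directly (instead of cbind()) keeps the degree numeric.

```r
library(igraph)

# Sample adjacency matrices from the question
mat1 <- matrix(c(0,4,3,2,0,1,3,4,0), nrow = 3, ncol = 3)
colnames(mat1) <- c("Haley.A", "Carl.F", "Chris.P")
mat2 <- matrix(c(0,3,2,0), nrow = 2, ncol = 2)
colnames(mat2) <- c("Tom.D", "Chris.P")
mat3 <- matrix(c(0,3,2,1,3,0,5,2,4,4,0,2,1,1,4,0), nrow = 4, ncol = 4)
colnames(mat3) <- c("Tom.D", "Carl.F", "Charles.L", "Peter.Q")

# Named list: the list names play the role of the network names
mats <- list(mat1 = mat1, mat2 = mat2, mat3 = mat3)

# Hypothetical helper: one network in, one long data frame of degrees out
degree_table <- function(net_name, adj) {
  g <- graph_from_adjacency_matrix(adj, weighted = TRUE,
                                   mode = "directed", diag = FALSE)
  d <- degree(g, mode = "all")  # named numeric vector, one entry per vertex
  data.frame(network_name = net_name,
             person_name  = names(d),
             degree_all   = as.numeric(d),
             stringsAsFactors = FALSE)
}

# Apply the helper to every network and stack the results
all_degrees <- do.call(rbind, unname(Map(degree_table, names(mats), mats)))

# The network/person pairs from the question
all_pairs <- data.frame(
  network_name = c("mat1", "mat1", "mat1", "mat2", "mat2",
                   "mat3", "mat3", "mat3", "mat3"),
  person_name  = c("Haley.A", "Carl.F", "Chris.P", "Tom.D", "Chris.P",
                   "Tom.D", "Carl.F", "Charles.L", "Peter.Q"),
  stringsAsFactors = FALSE
)

# Left join: keep every pair, attach the degree where one was computed
result <- merge(all_pairs, all_degrees,
                by = c("network_name", "person_name"), all.x = TRUE)
print(result)
```

Note that merge() sorts the result by the key columns by default, so the rows will not be in the original all_pairs order. Other measures (for example strength or betweenness) could be added as extra columns inside the helper in the same way.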