Suppose I have a dataframe like this:
ID sp1 sp2 sp3
1 NA 1 1
2 0 0 1
3 1 NA 0
4 1 1 1
Here is what I wanted to get:
ID 1 2 3 4
1 2 1 0 2
2 1 1 0 1
3 0 0 1 1
4 2 1 1 3
which shows, for each pair of IDs, the number of columns in which both have the value 1.
As the original dataframe is quite large, I hope to find an efficient way to address this.
Thank you very much for any efforts.
In order to create a co-occurrence matrix from your data, you first need to convert your NAs into 0s, then take the cross-product of your data without the first (ID) column:
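# Example data; NA marks a missing observation
x = data.frame(ID = c(1:4), sp1 = c(NA,0,1,1), sp2 = c(1,0,NA,1), sp3 = c(1,1,0,1))
x[is.na(x)] = 0        # treat NA as 0
crossprod(t(x[-1]))    # drop the ID column; each row of x becomes a row/column of the result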
     [,1] [,2] [,3] [,4]
[1,]    2    1    0    2
[2,]    1    1    0    1
[3,]    0    0    1    1
[4,]    2    1    1    3
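
If you also want the IDs as row and column labels, as in the desired output, one possible variation is sketched here. It assumes the species columns contain only 0, 1, or NA (the cross-product only counts shared 1s when the data are binary). The names m and co are just illustrative, not part of the original answer. tcrossprod(m) computes m %*% t(m), so the explicit t() is not needed.

```r
# Sketch under the assumption that all species columns are 0/1/NA.
x <- data.frame(ID  = 1:4,
                sp1 = c(NA, 0, 1, 1),
                sp2 = c(1, 0, NA, 1),
                sp3 = c(1, 1, 0, 1))

m <- as.matrix(x[-1])   # drop the ID column, keep the species columns
m[is.na(m)] <- 0        # treat NA as 0 (absence)
rownames(m) <- x$ID     # label each row with its ID

co <- tcrossprod(m)     # same as m %*% t(m); dimnames come from rownames(m)
co
```

Each off-diagonal entry of co is the number of columns in which both IDs have a 1, and each diagonal entry is the number of 1s for that ID. For a very large, mostly-zero table, a sparse matrix from the Matrix package might reduce memory use, but that is a suggestion I have not benchmarked here.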