I am after a little help finding the position of a particular value within a big.matrix object. Take the following matrix:
library(bigmemory)
X <- as.big.matrix(matrix(1:30, 10, 3))
I want to find the row and column number where a particular value occurs, say
find_numb = 13
(this will be later used to subset against another matrix). On a standard matrix I do:
as.matrix(X) == find_numb #Convert big.matrix and find location of value
This returns a TRUE/FALSE matrix, which is great.
Now, when I do the same on a big.matrix
X == find_numb
I get the following error:
Error in X == find_numb :
comparison (1) is possible only for atomic and list types
This seems like a simple problem, but I do not fully understand the error (I am still learning R and programming in general), so I apologize for not understanding the atomic and list types well enough to solve this myself.
The above is of course a simplified example: the actual matrix is around 500 GB (hence big.matrix), and I want to search for a vector of different numbers to find their individual locations, e.g.
find_numb <- sample(1:10000, 2000)
I have an mclapply call drafted to do this; I am just stuck on this issue when trying to find the initial locations of each value.
Thank you for any help and guidance
With R package {bigstatsr}, you can do:
library(bigstatsr)
X <- as_FBM(matrix(sample(30, 5000, replace = TRUE), 50, 100))
# tuto for big_apply(): https://privefl.github.io/bigstatsr/articles/big-apply.html
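test <- big_apply(X, function(X, ind) {
  # find matches within the current block of columns
  res <- which(X[, ind, drop = FALSE] == 13, arr.ind = TRUE)
  # map block-relative column indices back to columns of the full matrix
  res[, 2] <- ind[res[, 2]]
  res
}, a.combine = "rbind")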
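# sanity check against the in-memory result (X[] loads the whole matrix, so only do this on small data)
test2 <- which(X[] == 13, arr.ind = TRUE)
all.equal(test, test2)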
Disclaimer: I'm the author of the package.
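Since the question asks about locating a whole vector of values rather than a single one, the same block-wise idea can probably be extended with `%in%`. The following is only a sketch under assumptions: the `vals` argument name and the extra `value` column are illustrative choices, not part of the package, and the small 10 x 3 matrix from the question stands in for the real data.

```r
library(bigstatsr)

# Small stand-in for the real data (same values as in the question)
X <- as_FBM(matrix(1:30, 10, 3))
find_numb <- c(13, 27)  # illustrative vector of values to locate

hits <- big_apply(X, function(X, ind, vals) {
  # read only the current block of columns into memory
  block <- X[, ind, drop = FALSE]
  # %in% drops dimensions, so rebuild a logical matrix before which()
  res <- which(matrix(block %in% vals, nrow = nrow(block)), arr.ind = TRUE)
  # record which value was found at each position (block-relative indexing)
  found <- block[res]
  # map block-relative column indices back to columns of the full matrix
  res[, "col"] <- ind[res[, "col"]]
  cbind(res, value = found)
}, a.combine = "rbind", vals = find_numb)

hits
```

Each row of `hits` gives the row index, column index, and matched value, so the result can be split by `value` afterwards if the locations are needed per number.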