Following the tbb example for parallel_reduce here with basic datatypes, I wanted to try to implement a tbb version of a row sum of an Armadillo matrix. (I realize arma::sum exists for this purpose, but this was an attempt at getting to know tbb, Armadillo, and how they work together a bit better.)
The following code was meant to loop over the columns of a matrix X and add each column's values to a running total vector. It compiles fine but crashes at run time. I am wondering where I am going wrong. Is the use of X.col(j) incorrect? Thank you!
arma::vec parallelRowsumTBB_withArma(arma::mat X){
    arma::vec y=tbb::parallel_reduce(
        tbb::blocked_range<size_t>(0,X.n_cols),
        arma::vec(X.n_rows).fill(0),
        [&](const tbb::blocked_range<size_t>& r, arma::vec runningTotal) {
            for(int j=r.begin(); j!=r.end(); ++j){
                runningTotal = runningTotal + X.col(j);
            }
            return runningTotal;
        },
        [](arma::vec a, arma::vec b){
            return a + b;
        }
    );
    return y;
}
That is a 'frequently asked question': parallel code must not call back into, or otherwise interact with, the main R process, because R is not safe for such calls from other threads. The Armadillo data structures provided by RcppArmadillo reuse R's memory by default, so your threads can end up interacting with the R process via gc().
See the RcppParallel package (which relies on TBB) and its documentation for an alternative: RMatrix and RVector. Or ensure your RcppArmadillo objects are in fact distinct copies.
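To make the "distinct copies" idea concrete, here is a rough sketch of the same row sum written against a plain column-major buffer that the caller owns, so no worker thread ever touches R-managed memory. It deliberately uses std::thread rather than TBB or RcppParallel so that it stands alone; the function name parallelRowSum and the n_workers parameter are made up for illustration and are not part of any of those libraries. Each worker adds its share of columns into a private accumulator, and the partial results are combined afterwards, which is the same split/join shape that parallel_reduce uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper, not part of TBB, Armadillo or RcppParallel.
// X is a column-major n_rows x n_cols buffer owned by the caller,
// e.g. a copy taken from R memory before any thread is started.
std::vector<double> parallelRowSum(const std::vector<double>& X,
                                   std::size_t n_rows, std::size_t n_cols,
                                   std::size_t n_workers) {
    assert(X.size() == n_rows * n_cols);
    // Use at least one worker and never more workers than columns.
    n_workers = std::max<std::size_t>(1, std::min(n_workers, n_cols));

    // One private accumulator per worker, so threads never share writes.
    std::vector<std::vector<double>> partial(
        n_workers, std::vector<double>(n_rows, 0.0));
    const std::size_t chunk = (n_cols + n_workers - 1) / n_workers;

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < n_workers; ++w) {
        const std::size_t begin = std::min(w * chunk, n_cols);
        const std::size_t end = std::min(begin + chunk, n_cols);
        pool.emplace_back([&X, &partial, n_rows, w, begin, end] {
            std::vector<double>& acc = partial[w];
            for (std::size_t j = begin; j < end; ++j)      // columns of this chunk
                for (std::size_t i = 0; i < n_rows; ++i)   // rows within a column
                    acc[i] += X[j * n_rows + i];
        });
    }
    for (std::thread& t : pool) t.join();

    // Join step: combine the partial sums on the calling thread.
    std::vector<double> total(n_rows, 0.0);
    for (const std::vector<double>& acc : partial)
        for (std::size_t i = 0; i < n_rows; ++i) total[i] += acc[i];
    return total;
}
```

In an Rcpp setting the buffer would be filled on the main thread before the workers start, and the result converted back to an R or Armadillo vector only after they have all joined. With RcppParallel, the RMatrix and RVector wrappers mentioned above play the role of this buffer.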