c++openmprcpparmadillo

Problem in modifying shared armadillo matrix with OpenMP


I tried using omp in RcppArmadillo in the following toy example

#include <RcppArmadillo.h>
#include<omp.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::plugins(cpp11)]]
// [[Rcpp::plugins(openmp)]]
using namespace arma;

// [[Rcpp::export]]
mat mod_cube(unsigned nrun, unsigned d, unsigned nthr=1 ) {
  mat x(d,nrun );
  x.print();
#pragma omp parallel for shared(x) num_threads(nthr)
  for(unsigned run=0;run<x.n_cols;++run){
    Rcpp::Rcout<<"thread_id ="<<omp_get_thread_num()<<endl;
    (x.col(run )).fill((double) (run+1) );
  }
  return x;
}

i.e., I fill each column with a value in parallel. The code runs fine for nthr=1 but produces the following error if I set nthr=5 or above.

Error: C stack usage  589726373052 is too close to the limit
> 
 *** caught segfault ***
address 0x500004400, cause 'memory not mapped'

I am unable to find the reason since the column modifications are apparently independent tasks.

I know that this particular code can be written in various other ways. I need to use omp for a much more complicated script. I am trying to find if there's something obvious that I might be missing via this simple example.


Solution

  • As hinted in my comment, the Rcpp::Rcout inside the parallel loop does you in. It works for me both when I used std::cout (but the output is of course garbled) or when I simply remove it.

    I include a minorly-updated version of your code. We no longer need the C++11 plugin as (current) R defaults to newer versions anyway.

    Code

    #include <RcppArmadillo.h>
    #include <omp.h>
    
    // [[Rcpp::depends(RcppArmadillo)]]
    // [[Rcpp::plugins(openmp)]]
    
    // [[Rcpp::export]]
    arma::mat mod_cube(int nrun, int d, int nthr=1) {
        arma::mat x(d, nrun);
    #pragma omp parallel for shared(x) num_threads(nthr)
        for (size_t run = 0; run < nrun; run++) {
            x.col(run).fill((double) run+1);
        }
        return x;
    }
    
    /*** R
    mod_cube(10, 10, 10)
    */
    

    Output

    > Rcpp::sourceCpp("question.cpp")
    
    > mod_cube(10, 10, 10)
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    1    2    3    4    5    6    7    8    9    10
     [2,]    1    2    3    4    5    6    7    8    9    10
     [3,]    1    2    3    4    5    6    7    8    9    10
     [4,]    1    2    3    4    5    6    7    8    9    10
     [5,]    1    2    3    4    5    6    7    8    9    10
     [6,]    1    2    3    4    5    6    7    8    9    10
     [7,]    1    2    3    4    5    6    7    8    9    10
     [8,]    1    2    3    4    5    6    7    8    9    10
     [9,]    1    2    3    4    5    6    7    8    9    10
    [10,]    1    2    3    4    5    6    7    8    9    10
    >