I have a vector x
of type Eigen::VectorXi
with more than 2^31-1 entries, which I would like to return to R. I can do that by copying x
entry-wisely to a new vector of type Rcpp::IntegerVector
, but that seems to be quite slow.
I am wondering:
Rcpp::wrap(x)
doesn't work.test.cpp
#include <RcppEigen.h>
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::export]]
SEXP foo(const R_xlen_t size) {
Eigen::VectorXi x(size);
for (R_xlen_t i = 0; i < size; i++) {
x(i) = 1;
}
return Rcpp::wrap(x);
}
// [[Rcpp::export]]
Rcpp::IntegerVector fooSlow(const R_xlen_t size) {
Eigen::VectorXi x(size);
for (R_xlen_t i = 0; i < size; i++) {
x(i) = 1;
}
Rcpp::IntegerVector y(size);
for (R_xlen_t i = 0; i < size; i++) {
y(i) = x(i);
}
return y;
}
test.R
Rcpp::sourceCpp("./test.cpp")
a <- foo(2^(31)) # Error in foo(2^(31)) : negative length vectors are not allowed
a <- fooSlow(2^(31)) # runs fine but it's slow
Rcpp::wrap
is dispatching to a method for Eigen matrices and vectors implemented in RcppEigen
. That method doesn't appear to support long vectors, currently. (Edit: It now does; see below.)
The error about negative length is thrown by allocVector3
here. It arises when allocVector3
is called with a negative value for its argument length
. My guess is that Rcpp::wrap
tries to represent 2^31
as an int
, resulting in integer overflow. Maybe this happens here?
In any case, you seem to have stumbled on a bug, so you might consider sharing your example with the RcppEigen
maintainers on GitHub. (Edit: Never mind - I've just submitted a patch.) (Edit: Patched now, if you'd like to build RcppEigen
from sources [commit 5fd125e or later] in order to update your Rcpp::wrap
.)
Attempting to answer your first question, I compared your two approaches with my own based on std::memcpy
. The std::memcpy
approach supports long vectors and is only slightly slower than Rcpp::wrap
.
std::memcpy
approachThe C arrays beneath Eigen::VectorXi x
and Rcpp::IntegerVector y
have the same type (int
) and length (n
), so they contain the same number of bytes. You can use std::memcpy
to copy that number of bytes from one's memory address to other's without a for
loop. The hard part is knowing how to obtain the addresses. Eigen::VectorXi
has a member function data
that returns the address of the underlying int
array. R objects of integer type use INTEGER
from the R API, which does the same thing.
Rcpp::sourceCpp(code = '
#include <RcppEigen.h>
#include <Rinternals.h>
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::export]]
Rcpp::IntegerVector f_for(const R_xlen_t n) {
Eigen::VectorXi x(n);
for (R_xlen_t i = 0; i < n; ++i) {
x(i) = i % 10;
}
Rcpp::IntegerVector y(n);
for (R_xlen_t i = 0; i < n; ++i) {
y(i) = x(i);
}
return y;
}
// [[Rcpp::export]]
Rcpp::IntegerVector f_wrap(const R_xlen_t n) {
Eigen::VectorXi x(n);
for (R_xlen_t i = 0; i < n; ++i) {
x(i) = i % 10;
}
return Rcpp::wrap(x);
}
// [[Rcpp::export]]
Rcpp::IntegerVector f_memcpy(const R_xlen_t n) {
Eigen::VectorXi x(n);
for (R_xlen_t i = 0; i < n; ++i) {
x(i) = i % 10;
}
Rcpp::IntegerVector y(n);
memcpy(INTEGER(y), x.data(), n * sizeof(int));
return y;
}
')
n <- 100L
x <- rep_len(0:9, n)
identical(f_for(n), x) # TRUE
identical(f_wrap(n), x) # TRUE
identical(f_memcpy(n), x) # TRUE
b <- function(n) microbenchmark::microbenchmark(f_for(n), f_wrap(n), f_memcpy(n), setup = gc(FALSE))
b(2^10)
## Unit: microseconds
## expr min lq mean median uq max neval
## f_for(n) 6.806 8.5280 15.09497 10.332 11.8900 461.496 100
## f_wrap(n) 4.469 6.2115 12.60750 8.569 9.7170 435.420 100
## f_memcpy(n) 4.633 7.0520 13.64193 9.061 9.6965 465.924 100
b(2^20)
## Unit: microseconds
## expr min lq mean median uq max neval
## f_for(n) 3094.106 3118.2960 3160.2501 3132.4205 3171.329 3515.996 100
## f_wrap(n) 864.690 890.0485 912.7006 905.4440 929.593 988.797 100
## f_memcpy(n) 940.048 971.6590 1001.9805 987.3825 1009.195 1428.235 100
b(2^30)
## Unit: seconds
## expr min lq mean median uq max neval
## f_for(n) 3.527164 3.554672 3.575698 3.573021 3.593006 3.693711 100
## f_wrap(n) 1.119750 1.133130 1.143425 1.139702 1.149030 1.203602 100
## f_memcpy(n) 1.304877 1.330994 1.343253 1.339099 1.354286 1.422912 100