Consider the following code snippet,
#include <cstdint>   // uint32_t
#include <iostream>
#include <valarray>
using namespace std;

// Print a valarray as "[a, b, c, ]"; taken by const reference to avoid a copy.
std::ostream & operator<<(std::ostream & out, const std::valarray<int> & inputVector);

typedef std::valarray<std::valarray<int> > val2d;

int main()
{
    val2d g(std::valarray<int>(10),4);   // 4 rows of 10 ints each
    for (uint32_t n=0; n<4; ++n){
        for (uint32_t m=0; m<10; ++m){
            g[n][m] = n*10 + m;
        }
    }
    std::valarray<int> g_slice_rs = g[1][std::slice(0,10,1)]; // row slice
    //std::valarray<int> g_slice_cs = g[std::slice(0,1,3)][0]; // column slice (comment out)
    cout<<"row slice :: "<<g_slice_rs<<endl;
    //cout<<"column slice :: "<<g_slice_cs<<endl; // (comment out)
    return 0;
}

std::ostream & operator<<(std::ostream & out, const std::valarray<int> & inputVector)
{
    uint32_t vecLength = inputVector.size();
    out<<"[";
    for (uint32_t i=0; i<vecLength; ++i)
    {
        out <<inputVector[i]<<", ";
    }
    out<<"]"<<endl;
    return out;
}
Here I'm able to access the row slices, but not the column slices (as indicated in the comments). Is there any workaround to access column slices? This thread does not provide the answer.
First off, you don't have a 2D valarray. You have a valarray of valarrays, a difference you should not ignore.
x = g[m][n];
only looks like an array-style access. It's really closer to
temp = g[m];
x = temp[n];
A valarray's datastore is a nice contiguous block of memory, but if you have an M by N structure, you have M+1 valarrays potentially scattered throughout memory. This can turn into a nightmare of performance-killing cache misses.
You are going to have to decide which is more important to be fast, row slicing or column slicing, because only one will be going with the flow of memory and the other will require a cache-thrashing copy against the grain.
Currently
g[1][std::slice(0,10,1)];
works because it is slicing a contiguous block of memory, and
g[std::slice(0,1,3)][0]
fails because it must reach across M distinct valarrays to gather the slice and std::slice can't do that. You will have to manually copy the elements you want from each of the valarrays that make up the column. Sucks, huh?
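If you are stuck with the valarray of valarrays, the manual copy described above could look roughly like this. It's a minimal sketch: copy_column is a name I made up for illustration, and it assumes every row is long enough to have the requested column.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

typedef std::valarray<std::valarray<int> > val2d;

// Hypothetical helper: gather one column by visiting every row valarray in turn.
// Each g[row] is a separate valarray, so this walks M separate blocks of memory.
std::valarray<int> copy_column(const val2d & g, std::size_t col)
{
    std::valarray<int> column(g.size());      // one element per row
    for (std::size_t row = 0; row < g.size(); ++row)
    {
        assert(col < g[row].size());          // assumption: no short rows
        column[row] = g[row][col];
    }
    return column;
}
```

With the g from the question, copy_column(g, 0) would collect g[0][0], g[1][0], g[2][0] and g[3][0] into a new 4-element valarray.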
So what do you do?
You fake it! Muhuhahahahahahahaha!
Don't make a valarray of valarrays. Make one big valarray of size MxN. So say goodbye to
std::valarray<std::valarray<int> > g(std::valarray<int>(10),4);
and hello to
std::valarray<int> g(10*4);
Now you can take advantage of std::slice's stride parameter to grab every tenth element:
std::slice(column_to_slice,4,10);
And as an added bonus you now have one contiguous block of memory so at least some of that cache-grinding abuse should be mitigated. You're still smurfed if the stride is too large.
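Here is a small sketch of both slices on the flat layout, assuming row-major storage so that element (r, c) lives at index r * cols + c. The helper names row_slice and column_slice are mine, not anything from the standard library; the only library pieces used are std::valarray and std::slice(start, size, stride).

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Row r: cols consecutive elements starting at r * cols (stride 1).
std::valarray<int> row_slice(const std::valarray<int> & flat,
                             std::size_t cols, std::size_t r)
{
    assert((r + 1) * cols <= flat.size());
    return flat[std::slice(r * cols, cols, 1)];
}

// Column c: rows elements starting at c, jumping a whole row (cols) each time.
std::valarray<int> column_slice(const std::valarray<int> & flat,
                                std::size_t rows, std::size_t cols, std::size_t c)
{
    assert(c < cols && rows * cols <= flat.size());
    return flat[std::slice(c, rows, cols)];
}
```

For the 4 by 10 case in the question, column_slice(g, 4, 10, column_to_slice) is the same std::slice(column_to_slice,4,10) shown above, and row_slice(g, 10, 1) replaces the old g[1][std::slice(0,10,1)].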
I whole-heartedly recommend wrapping this in an object to make access and management easier. Something like this, except you use the valarray instead of the raw pointer.
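To give an idea of what such a wrapper might look like with a valarray inside, here is a rough sketch. The class name Matrix2D and all of its members are hypothetical, picked just for this example, and it assumes row-major storage of int.

```cpp
#include <cassert>
#include <cstddef>
#include <valarray>

// Hypothetical wrapper: one flat valarray plus the index math in one place.
class Matrix2D
{
public:
    Matrix2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}   // zero-initialized

    // Element access: (r, c) maps to r * cols_ + c in the flat store.
    int & operator()(std::size_t r, std::size_t c)
    {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }
    int operator()(std::size_t r, std::size_t c) const
    {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    // Copies of a whole row (stride 1) or a whole column (stride cols_).
    std::valarray<int> row(std::size_t r) const
    {
        assert(r < rows_);
        return data_[std::slice(r * cols_, cols_, 1)];
    }
    std::valarray<int> column(std::size_t c) const
    {
        assert(c < cols_);
        return data_[std::slice(c, rows_, cols_)];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::valarray<int> data_;
};
```

Usage would be along the lines of Matrix2D g(4, 10); g(n, m) = n*10 + m; and then g.row(1) or g.column(0). Both slices return copies, so writing through them does not touch the matrix.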