c++ stl cuda thrust stream-compaction

CUDA Thrust: How can I combine copy_if and transform without materializing data?


Let's say we have two inputs: the first is an array, and the second is a bitmap.

thrust::device_vector<point_t> points;
Bitset bits; // Imagine this can be accessed within the kernel.

I want to copy only the valid points to an output vector.

thrust::device_vector<point_t> output;

To query the Bitset, I need to pass a number that is exactly the index of the point in the points array.

The logic looks like this:

for(size_t i = 0; i < points.size(); i++) {
    if (bits.is_active(i)) {
        output.push_back(points[i]);
    }
}

I believe this can be achieved with a combination of copy_if, make_transform_iterator, make_zip_iterator, etc. However, the predicate for copy_if needs to see the value before it is transformed (the index, not just the point), and in my attempt below it does not. How can I make this work?

    auto get_point =
        [] __device__(const thrust::tuple<size_t, point_t>& t) {
          return thrust::get<1>(t);
        };

    auto it1 = thrust::make_transform_iterator(
        thrust::make_zip_iterator(thrust::make_tuple(
            thrust::counting_iterator<size_t>(0), points.begin())),
        get_point);
    auto it2 = thrust::make_transform_iterator(
        thrust::make_zip_iterator(thrust::make_tuple(
            thrust::counting_iterator<size_t>(points.size()), points.end())),
        get_point);

    thrust::copy_if(
        thrust::cuda::par.on(stream), it1, it2,
        output.begin(),
        [=] __device__(const thrust::tuple<size_t, point_t>& t) {
          auto index = thrust::get<0>(t);

          return bits.is_active(index);
        });

Solution

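  • Thanks to Abator, whose solution works: use the copy_if overload that takes a stencil range, so the predicate tests the index while the point itself is copied.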

    auto end = thrust::copy_if(
        thrust::cuda::par.on(stream), 
        points.begin(), 
        points.end(),
        thrust::counting_iterator<size_t>(0),
        output.begin(),
        [=] __device__(size_t i) { return bits.is_active(i); }
    );
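
To make the semantics of the stencil overload concrete, here is a minimal host-side reference in plain C++ (no Thrust or GPU required). It is only a sketch: the `point_t` and `Bitset` definitions, the `set` method, and the helper names `copy_if_by_index` and `demo` are hypothetical stand-ins, since the post does not show the real types. The loop mirrors what `thrust::copy_if(first, last, stencil, result, pred)` computes when the stencil is a counting iterator starting at 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the question's types; the real point_t and
// Bitset are not shown in the post.
struct point_t {
    float x;
    float y;
};

struct Bitset {
    std::vector<std::uint64_t> words;

    explicit Bitset(std::size_t n_bits) : words((n_bits + 63) / 64, 0) {}

    void set(std::size_t i) { words[i / 64] |= std::uint64_t{1} << (i % 64); }

    bool is_active(std::size_t i) const {
        return ((words[i / 64] >> (i % 64)) & std::uint64_t{1}) != 0;
    }
};

// Host-side reference for the stencil form of copy_if: the predicate sees
// the stencil element (here, the index i), while the value copied is input[i].
template <typename Pred>
std::vector<point_t> copy_if_by_index(const std::vector<point_t>& input, Pred pred) {
    std::vector<point_t> output;
    output.reserve(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (pred(i)) {
            output.push_back(input[i]);
        }
    }
    return output;
}

// Small demo: keep the points at indices 1 and 3 out of five.
std::vector<point_t> demo() {
    std::vector<point_t> points = {
        {0.0f, 0.0f}, {1.0f, 10.0f}, {2.0f, 20.0f}, {3.0f, 30.0f}, {4.0f, 40.0f}};
    Bitset bits(points.size());
    bits.set(1);
    bits.set(3);
    // The bitmap must cover every index the predicate will be asked about.
    assert(bits.words.size() * 64 >= points.size());
    return copy_if_by_index(points, [&](std::size_t i) { return bits.is_active(i); });
}
```

A reference like this can be handy for checking the GPU result on small inputs. In the Thrust version, output should have room for up to points.size() elements before the call, and the returned end iterator marks where the copied range stops, so output can be shrunk to end - output.begin() afterwards.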