I am trying to figure out the proper way to enable cub in cupy, but without success so far. I looked into the documentation and couldn't find anything. At the moment I enable cub like this:
import cupy.core._accelerator as _acc
_acc.set_routine_accelerators(['cub'])
_acc.set_reduction_accelerators(['cub'])
Before executing the above code, cub is disabled. I confirm that by running:
cupy.core.get_reduction_accelerators()
cupy.core.get_routine_accelerators()
which both return an empty list ([]). After running the code in the first snippet, the above functions return [1] (whatever that means). I also notice a massive performance difference in functions like cupy.nansum.
As you can see though, the functions set_routine_accelerators and set_reduction_accelerators belong to a private API (cupy.core._accelerator), which implies that I shouldn't call them.
What is the proper way to enable cub in cupy?
I am using Python 3.7.6 and cupy 8.1.0.
Thank you
The documented way of doing it is through the CUPY_ACCELERATORS environment variable:
export CUPY_ACCELERATORS=cub
See docs.cupy.dev/en/stable/reference/environment.html. As you noticed, set_*_accelerators is mostly a private API that we use for testing. They return 1 because we use an Enum. I agree it is confusing ... maybe we can change it :).
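If exporting the variable in the shell is inconvenient (for example in a notebook), the same setting can probably be applied from Python. This is only a sketch: it assumes CuPy reads CUPY_ACCELERATORS when it is first imported, so the variable has to be set before `import cupy`. The helper name `enable_cupy_accelerators` is hypothetical and not part of CuPy, and joining several backend names with commas is my reading of the environment-variable docs rather than something stated above.

```python
import os


def enable_cupy_accelerators(names=("cub",)):
    """Set CUPY_ACCELERATORS for the current process.

    Hypothetical helper, not part of CuPy. Call it BEFORE importing cupy,
    on the assumption that CuPy reads the variable at import time.
    """
    value = ",".join(names)  # assumed format: comma-separated backend names
    os.environ["CUPY_ACCELERATORS"] = value
    return value


enable_cupy_accelerators()

# Import CuPy only after the variable is set, for example:
# import cupy
# cupy.nansum(cupy.arange(10, dtype=cupy.float32))
```

Setting the variable after CuPy has already been imported in the same process would presumably have no effect, so in an interactive session the interpreter may need restarting first.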