I need to compute the exponential of a matrix inside a CUDA kernel. Is there a library that provides a function for this which can be called from within a CUDA thread? Or would it be possible to implement it from scratch as a __device__ function?
I am using Microsoft Visual Studio 2008 Express to compile the host code and the nvcc compiler from CUDA Toolkit 3.2.
GPU: NVIDIA GeForce GT640 (compute capability 3.0)
No, there is no such function in the CUDA libraries, but you might look at this code to help you design a solution in CUDA:
https://github.com/poliu2s/MKL/blob/master/matrix_exponential.cpp
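As for the from-scratch __device__ route asked about in the question, here is a minimal sketch of what such a function might look like. It is not taken from the linked code. It assumes a small matrix whose size N is known at compile time, row-major storage, and double precision, and it uses scaling and squaring with a truncated Taylor series (a simple choice, not the Padé-based method that production libraries tend to use). The names matmul, matrix_exp and the terms parameter are made up for illustration. The macro guard only exists so the same source also builds with a plain C++ compiler for checking on the host.

```cpp
// Let the same source build with nvcc (device code) or a plain C++ compiler.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper: C = A * B for row-major N x N matrices.
// C must not alias A or B.
template <int N>
__host__ __device__ void matmul(const double* A, const double* B, double* C)
{
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j) {
            double s = 0.0;
            for (int k = 0; k < N; ++k)
                s += A[i * N + k] * B[k * N + j];
            C[i * N + j] = s;
        }
}

// Hypothetical helper: E = exp(A), scaling and squaring + truncated Taylor series.
// Uses three N x N scratch arrays per calling thread, so it only suits small N.
template <int N>
__host__ __device__ void matrix_exp(const double* A, double* E, int terms = 12)
{
    double S[N * N], T[N * N], W[N * N];

    // Infinity norm of A (largest absolute row sum).
    double norm = 0.0;
    for (int i = 0; i < N; ++i) {
        double r = 0.0;
        for (int j = 0; j < N; ++j) {
            double v = A[i * N + j];
            r += (v < 0.0) ? -v : v;
        }
        if (r > norm) norm = r;
    }

    // Pick s so that the norm of A / 2^s is at most 0.5.
    int s = 0;
    double scale = 1.0;
    while (norm * scale > 0.5) { scale *= 0.5; ++s; }

    // S = A / 2^s; E and T (the current Taylor term) start as the identity.
    for (int i = 0; i < N * N; ++i) {
        S[i] = A[i] * scale;
        E[i] = T[i] = (i % (N + 1) == 0) ? 1.0 : 0.0;
    }

    // E = I + S + S^2/2! + ... + S^terms/terms!
    for (int k = 1; k <= terms; ++k) {
        matmul<N>(T, S, W);
        for (int i = 0; i < N * N; ++i) {
            T[i] = W[i] / k;
            E[i] += T[i];
        }
    }

    // Undo the scaling: exp(A) = (exp(A / 2^s))^(2^s), i.e. square s times.
    for (int q = 0; q < s; ++q) {
        matmul<N>(E, E, W);
        for (int i = 0; i < N * N; ++i) E[i] = W[i];
    }
}
```

Inside a kernel, each thread could call something like matrix_exp<4>(in + idx * 16, out + idx * 16) on its own matrix, where in, out and idx are whatever pointers and thread index you already have. The truncation order and the 0.5 threshold are assumptions that you would want to validate against a reference implementation for your own matrices.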
If you are working on a device of compute capability 3.5, the problem could be easier to solve with dynamic parallelism: you can launch a __global__ kernel from another __global__ kernel without returning to the host, so you can choose the launch configuration (threads and blocks) you want at that point.
Basically:
__global__ void child( ... )
{
    ....
}
__global__ void parent( ... )
{
    // Launched from the device: choose the grid and block sizes here.
    child<<< ..., ... >>>( ... );
}
Hope this helps.