c++ cuda directx-11 onnx onnxruntime

Using the ONNX Runtime C++ API, is there a way to convert a cudaArray to an Ort::Value without leaving the GPU?


I have obtained a cudaArray via the DX11/CUDA interop and stripped the alpha channel using a CUDA kernel, so at this point I've done all my transforms without leaving the GPU. I would like to create an Ort::Value from that same object without pulling it down to CPU land.

Is this possible?


Solution

  • Yes, that is possible: you can create an Ort::Value from a raw resource (a CUDA pointer or a DX resource). The API that you are looking for is here. There is a C++ equivalent, Ort::Value::CreateTensor.

    For DX I have some sample code here. The sample also contains CUDA code, but the allocations there are done using the session allocator instead of externally allocated memory. With the same API as linked above, however, you can pass the raw pointer you get from cudaMalloc; see this unit test for an example.