c++ opencv cuda orb

OpenCV CUDA ORB gives run-time error. What's the root cause and how to fix it?


I'm trying to find ORB features using OpenCV. When I pass an image directly, the features tend to cluster around the central region of the image. To avoid that, I divide the image into a grid, treat each grid cell as a small image, and detect features in each cell separately. That way, the features are spread over the whole image instead of being limited to the central area.
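The grid split described above can be sketched as follows. This is only an illustration under stated assumptions: `GridRect` is a hypothetical stand-in for `cv::Rect` so that the snippet compiles without OpenCV, and `make_grid_rois` is a made-up helper name, not something from my actual code or from the library.

```cpp
#include <vector>

// Hypothetical stand-in for cv::Rect so the sketch compiles without OpenCV.
struct GridRect {
    int x;
    int y;
    int width;
    int height;
};

// Split an image_width x image_height image into rows x cols cells.
// The last row and last column absorb any remainder, so the cells
// always cover the whole image without overlapping.
std::vector<GridRect> make_grid_rois(int image_width, int image_height,
                                     int rows, int cols) {
    std::vector<GridRect> rois;
    // Reject grids that would produce empty cells.
    if (rows <= 0 || cols <= 0 || rows > image_height || cols > image_width) {
        return rois;
    }
    const int cell_w = image_width / cols;
    const int cell_h = image_height / rows;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            const int x = c * cell_w;
            const int y = r * cell_h;
            const int w = (c == cols - 1) ? image_width - x : cell_w;
            const int h = (r == rows - 1) ? image_height - y : cell_h;
            rois.push_back({x, y, w, h});
        }
    }
    return rois;
}
```

With OpenCV, each cell would be converted to a `cv::Rect` and used to take a view such as `big_image(roi)`. Keypoints detected in a cell are in cell coordinates, so the cell's `x` and `y` would need to be added back to map them onto the full image.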

Now, to speed things up, I decided to use OpenCV's CUDA implementation for ORB. Conveniently enough, its initialization is the same as the non-CUDA version:

static Ptr<ORB> cv::cuda::ORB::create (int   nfeatures = 500,
                                       float scaleFactor = 1.2f,
                                       int   nlevels = 8,
                                       int   edgeThreshold = 31,
                                       int   firstLevel = 0,
                                       int   WTA_K = 2,
                                       int   scoreType = ORB::HARRIS_SCORE,
                                       int   patchSize = 31,
                                       int   fastThreshold = 20,
                                       bool  blurForDescriptor = false 
                                      )

So, I was able to use the same parameter values. Also, I didn't have to make many changes in the code where I was finding the features:

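for (const auto& rectangle_roi : list_of_rectangle_rois) {
    // View into the full image for this grid cell.
    cv::Mat cpu_patch = big_image(rectangle_roi);
    cv::cuda::GpuMat gpu_patch;
    gpu_patch.upload(cpu_patch);
    // Empty mask, plus a temporary GpuMat for the descriptors, which I don't need.
    orb_cuda->detectAndCompute(gpu_patch, cv::cuda::GpuMat(), key_point_vector, cv::cuda::GpuMat());
}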

I don't get any compilation error and it runs fine for the first few frames. But then I always get this run-time error:

terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.x.x) /tmp/opencv/modules/core/src/matrix_wrap.cpp:1659: error: (-215:Assertion failed) !fixedSize() in function 'release'

Aborted (core dumped)

I tried different image sequences, but I get the same error after a certain number of frames.

I suspected it might be running out of GPU (CUDA) memory, so I added code to check the total and free CUDA memory after each iteration. Both values stay the same from the start until the run-time error occurs.

What is the root cause, and how can I fix this error?

Note: If I don't divide the image into a grid, and instead upload the entire image to a cv::cuda::GpuMat and pass that to orb_cuda, I don't get this error.

Edit 1: I have several different datasets, and this error occurs in all of them. The frame at which it fails varies from dataset to dataset, but for a given dataset it is the same every time I run the code. Because it happens on multiple frames (and the datasets are not public), I can't share them here.


Solution

  • I found a solution (though I'm not 100% sure why it works).

    In my application I don't need the feature descriptors, so when I called detectAndCompute I was passing an empty temporary GpuMat for them:

        orb_cuda->detectAndCompute(gpu_patch, cv::cuda::GpuMat(), key_point_vector, cv::cuda::GpuMat());
    

    According to the OpenCV documentation, the function is declared as:

    virtual void cv::Feature2D::detectAndCompute (
        InputArray                image,
        InputArray                mask,
        std::vector< KeyPoint > & keypoints,
        OutputArray               descriptors,
        bool                      useProvidedKeypoints = false 
    )   
    

    The descriptors parameter is an OutputArray. If, instead of passing the temporary cv::cuda::GpuMat() for it, I declare a named cv::cuda::GpuMat variable and pass that, I don't get the error:

        cv::cuda::GpuMat gpu_descriptor_mat;
        orb_cuda->detectAndCompute(gpu_patch, cv::cuda::GpuMat(), key_point_vector, gpu_descriptor_mat);
    

    The only explanation I can think of is that, because the descriptors parameter is an OutputArray, the function needs a named cv::cuda::GpuMat object that it can write into, not a temporary. Both objects are empty, so emptiness itself does not seem to be the problem. Again, I'm not 100% sure about this. If someone has better insight and can share it here, I would greatly appreciate it.
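One plausible mechanism, offered as a hedged sketch and not a confirmed diagnosis: OpenCV's `_OutputArray` has separate constructors for `cuda::GpuMat&` and `const cuda::GpuMat&`, and the const-reference one marks the array as fixed-size. A temporary `cv::cuda::GpuMat()` can only bind to the const-reference constructor. If the CUDA ORB code calls `release()` on the descriptors when a patch yields no keypoints (an assumption I have not checked against every OpenCV version), that call would trip the `!fixedSize()` assertion. This would be consistent with the error appearing only on certain grid cells at reproducible frames and never on whole images. The toy model below reproduces the overload behaviour in plain C++; `ToyMat`, `ToyOutputArray`, and `toy_compute_descriptors` are hypothetical names and are not OpenCV code.

```cpp
#include <stdexcept>
#include <vector>

// Toy stand-in for a matrix type. NOT an OpenCV class.
struct ToyMat {
    std::vector<unsigned char> data;
};

// Toy stand-in for an output-array proxy. NOT OpenCV's _OutputArray.
class ToyOutputArray {
public:
    // Non-const lvalue: the callee may resize or release the buffer.
    ToyOutputArray(ToyMat& m) : mat_(&m), fixed_size_(false) {}

    // Const reference: the only overload a temporary can bind to.
    // The proxy treats such an argument as fixed-size.
    ToyOutputArray(const ToyMat& m)
        : mat_(const_cast<ToyMat*>(&m)), fixed_size_(true) {}

    bool fixedSize() const { return fixed_size_; }

    // Releasing a fixed-size array is an error, mirroring the assertion text.
    void release() const {
        if (fixed_size_) {
            throw std::logic_error("Assertion failed: !fixedSize()");
        }
        mat_->data.clear();
    }

private:
    ToyMat* mat_;
    bool fixed_size_;
};

// Hypothetical detector step: with zero keypoints it releases the output.
void toy_compute_descriptors(int keypoint_count, const ToyOutputArray& descriptors) {
    if (keypoint_count == 0) {
        descriptors.release();
        return;
    }
    // A real implementation would write the descriptors here.
}

// Passes a temporary, as in the failing call. Returns false if it throws.
bool call_with_temporary(int keypoint_count) {
    try {
        toy_compute_descriptors(keypoint_count, ToyMat());
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}

// Passes a named object, as in the working call. Returns false if it throws.
bool call_with_named(int keypoint_count) {
    ToyMat descriptors;
    try {
        toy_compute_descriptors(keypoint_count, descriptors);
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}
```

In this model, `call_with_temporary` only fails when the keypoint count is zero, while `call_with_named` succeeds either way, which matches the pattern of "runs fine for a few frames, then aborts".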