
OpenMP with OpenCV and pointers


I'm working on a C++ program that uses OpenCV to analyze video from a webcam and do some motion tracking (the ultimate goal is to build an automated airsoft sentry turret for a school project!).

I'm trying hard to optimize my processing to get the highest possible frame rate while analyzing the video. I tried to use OpenMP for parallel processing, but I'm having a hard time fitting it into my code. Here is one loop I'd like to use OpenMP on.

    Mat differenceImage(frame1.size(), CV_8UC1);
    long long* pf1 = reinterpret_cast<long long*>(grayImage1.ptr());
    long long* pf2 = reinterpret_cast<long long*>(grayImage2.ptr());
    long long* pf3 = reinterpret_cast<long long*>(differenceImage.ptr());
    long long* pfe = pf1 + grayImage1.size().width*grayImage1.size().height   * sizeof(uchar) / 8;

    long long  a, b, r1, r2, r3, r4, r5, r6, r7, r8, s1, s2, s3, s4, s5, s6, s7, s8, t1, t2, t3, t4, t5, t6, t7, t8;

    while (pf1 < pfe) {
       a = *pf1;
       b = *pf2;

       // Parentheses are needed: >> binds tighter than &, so mask first, then shift.
       s1 = (a & 0xFF00000000000000) >> 56;
       s2 = (a & 0x00FF000000000000) >> 48;
       s3 = (a & 0x0000FF0000000000) >> 40;
       s4 = (a & 0x000000FF00000000) >> 32;
       s5 = (a & 0x00000000FF000000) >> 24;
       s6 = (a & 0x0000000000FF0000) >> 16;
       s7 = (a & 0x000000000000FF00) >> 8;
       s8 = a & 0x00000000000000FF;

       t1 = (b & 0xFF00000000000000) >> 56;
       t2 = (b & 0x00FF000000000000) >> 48;
       t3 = (b & 0x0000FF0000000000) >> 40;
       t4 = (b & 0x000000FF00000000) >> 32;
       t5 = (b & 0x00000000FF000000) >> 24;
       t6 = (b & 0x0000000000FF0000) >> 16;
       t7 = (b & 0x000000000000FF00) >> 8;
       t8 = b & 0x00000000000000FF;


       r1 = s1 - t1;
       r2 = s2 - t2;
       r3 = s3 - t3;
       r4 = s4 - t4;
       r5 = s5 - t5;
       r6 = s6 - t6;
       r7 = s7 - t7;
       r8 = s8 - t8;

       if (r1 < 0) r1 = -r1;
       if (r2 < 0) r2 = -r2;
       if (r3 < 0) r3 = -r3;
       if (r4 < 0) r4 = -r4;
       if (r5 < 0) r5 = -r5;
       if (r6 < 0) r6 = -r6;
       if (r7 < 0) r7 = -r7;
       if (r8 < 0) r8 = -r8;

      *pf3 = (r1 << 56) | (r2 << 48) | (r3 << 40) | (r4 << 32) | (r5 << 24) | (r6 << 16) | (r7 << 8) | r8;

       ++pf1;
       ++pf2;
       ++pf3;
   }

Basically, I load two frames into Mat images and compute the difference between them. I tried to use OpenMP on that loop without success: I changed my while loop into a for loop so I could use "#pragma omp parallel for" on it, but it isn't working at all.

Can anyone give me some advice on using OpenMP in this case? Do you think it will help improve performance?

Thank you


Solution

  • This all seems overly complicated for a problem that looks simple enough... Why not go back to a straightforward approach that can easily be both parallelized and vectorized?

    I'm not too sure about the type of your data, but I'd go for something like this:

    long long nbElem = grayImage1.size().width * grayImage1.size().height;
    unsigned char *pf1 = grayImage1.ptr();
    unsigned char *pf2 = grayImage2.ptr();
    unsigned char *pf3 = differenceImage.ptr();
    
    #pragma omp parallel for simd
    for ( long long i = 0; i < nbElem; i++ ) {
         pf3[i] = pf1[i] > pf2[i] ? pf1[i] - pf2[i] : pf2[i] - pf1[i];
    }
    

    Normally (not tested), the compiler should then generate a parallelized and vectorized version of your initial code, and you gain a lot in readability and maintainability. Note that the combined parallel for simd construct requires a compiler that supports OpenMP 4.0 or later.
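
To make the idea easier to try in isolation, here is a minimal sketch of the same loop on plain byte buffers rather than cv::Mat. The helper name absDiff and the use of std::vector are assumptions for illustration only and are not part of the original post. It assumes both inputs have the same size and are stored contiguously; with cv::Mat you would probably want to check isContinuous() before walking the whole image through a single pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): per-pixel absolute
// difference of two equally sized 8-bit grayscale buffers.
std::vector<unsigned char> absDiff(const std::vector<unsigned char>& a,
                                   const std::vector<unsigned char>& b) {
    assert(a.size() == b.size());  // assumption: same dimensions
    std::vector<unsigned char> out(a.size());

    const unsigned char* pa = a.data();
    const unsigned char* pb = b.data();
    unsigned char* po = out.data();
    const long long n = static_cast<long long>(a.size());

    // With OpenMP 4.0+ enabled (e.g. -fopenmp on GCC/Clang), the iterations
    // are split across threads and each chunk may be vectorized. Without
    // OpenMP the pragma is ignored and the loop simply runs serially.
    #pragma omp parallel for simd
    for (long long i = 0; i < n; ++i) {
        // Subtract the smaller value from the larger one, so the result
        // always fits in an unsigned char without wrapping.
        po[i] = pa[i] > pb[i] ? static_cast<unsigned char>(pa[i] - pb[i])
                              : static_cast<unsigned char>(pb[i] - pa[i]);
    }
    return out;
}
```

Each iteration reads and writes only its own index, so there is no shared state between threads and no synchronization is needed. For 8-bit images OpenCV's own cv::absdiff computes the same per-element result, so it may be worth timing it against the hand-written loop before committing to either.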