I'm processing RGB images and doing the same thing for each channel (R, G, B), so I've been looking for parallel functions that could help me improve my code and run it (3x?) faster. Right now I'm using the forEach function like so:
unsigned char lut[256];
for (int i = 0; i < 256; i++)
    lut[i] = cv::saturate_cast<uchar>(pow((float)(i / 255.0), fGamma) * 255.0f); // pow: power exponent

dst.forEach<cv::Vec3b> // dst is an RGB image
(
    [&lut](cv::Vec3b &pixel, const int* po) -> void
    {
        pixel[0] = lut[(pixel[0])];
        pixel[1] = lut[(pixel[1])];
        pixel[2] = lut[(pixel[2])];
    }
);
But when I use htop to see the number of threads running, I only find one or two threads active.
Am I doing something wrong, or is forEach not supposed to run multi-threaded? Do you have any resources to help me get started with multi-threaded computation?
I'm running my code on Ubuntu, compiled with this:
g++ -std=c++1z -Wall -Ofast -march=native test3.cpp -o test3 `pkg-config --cflags --libs opencv`
Have you looked at TBB already? Threading Building Blocks is an Apache-licensed library for parallel computing, which you can use by compiling OpenCV with the flag -D WITH_TBB=ON.
See this example of parallel_for: http://www.jayrambhia.com/blog/opencv-with-tbb
If you decide to adopt TBB, follow these steps:
1 - Rebuild OpenCV with TBB support. If you are on a Linux machine, just run:
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D BUILD_TBB=ON ..
2 - Rewrite your program to use TBB.
See the answers to "Simplest TBB example", focusing on the most recent ones.
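To give a rough idea of what the rewrite in step 2 looks like, here is a minimal sketch of the underlying pattern: split the image into bands of rows and let each worker apply the LUT to its own band. This is not TBB or OpenCV code. It uses plain std::thread as a stand-in so it compiles without either library, and it assumes an interleaved 8-bit buffer instead of a cv::Mat. The names makeGammaLut and applyLutParallel are hypothetical helpers made up for this illustration. With TBB you would hand the same per-band loop body to its parallel_for instead of creating the threads yourself.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Build a 256-entry gamma lookup table (same formula as in the question).
// Hypothetical helper, not an OpenCV or TBB function.
std::vector<std::uint8_t> makeGammaLut(float gamma) {
    std::vector<std::uint8_t> lut(256);
    for (int i = 0; i < 256; ++i) {
        const float v = std::pow(i / 255.0f, gamma) * 255.0f;
        // Round and clamp to [0, 255], similar in spirit to cv::saturate_cast<uchar>.
        const long r = std::min(255L, std::max(0L, std::lround(v)));
        lut[i] = static_cast<std::uint8_t>(r);
    }
    return lut;
}

// Apply the LUT in place to an interleaved 8-bit buffer (assumed layout:
// rows * rowBytes bytes, e.g. rowBytes = cols * 3 for RGB).
// Each worker thread processes one contiguous band of rows, so no two
// threads ever write to the same byte and no locking is needed.
void applyLutParallel(std::vector<std::uint8_t>& pixels, std::size_t rows,
                      std::size_t rowBytes,
                      const std::vector<std::uint8_t>& lut,
                      unsigned numThreads) {
    numThreads = std::max(1u, numThreads);
    // Rows per worker, rounded up so every row is covered.
    const std::size_t band = (rows + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < rows; begin += band) {
        const std::size_t end = std::min(rows, begin + band);
        workers.emplace_back([&pixels, &lut, rowBytes, begin, end] {
            for (std::size_t i = begin * rowBytes; i < end * rowBytes; ++i)
                pixels[i] = lut[pixels[i]];
        });
    }
    // Wait for all bands to finish before returning.
    for (auto& w : workers) w.join();
}
```

You would call it as applyLutParallel(buffer, rows, cols * 3, makeGammaLut(fGamma), std::thread::hardware_concurrency()); note that hardware_concurrency() may return 0 when the core count is unknown, which the sketch treats as a single thread. With g++ you may need to add -pthread to the compile line. Since a LUT pass does very little work per byte, the speedup is likely to be limited by memory bandwidth rather than by the number of cores.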