I have been trying to implement the meanshift algorithm for tracking objects, and have gone through the concepts involved.
As per now I have managed to successfully generate a backprojected stream from my camera with a single channel hue roi histogram and a single channel hue video stream which seems fine, I know there is a meanshift function within the opencv library but I am trying to implement one myself using the data structures provided in opencv, calculating the moments and computing the mean centroid of the search window.
But for some reason I am unable to locate the problem within my code as it keeps on converging to the upper left corner of my video stream for any input roi (region of interest) to be tracked. Following is a code snippet of the function for calculating the centroid of the search window where I feel the problem lies but not sure what it is, I would really appreciate if someone can point me in the right direction:
void moment(Mat &backproj, Rect &win){
int x_c, y_c, x_c_new, y_c_new;
int idx_row, idx_col;
double m00 = 0.0 , m01 = 0.0 , m10 = 0.0 ;
double res = 1.0, TOL = 0.003 ;
//Set the center of search window as the center of the probabilistic image:
y_c = (int) backproj.rows / 2 ;
x_c = (int) backproj.cols / 2 ;
//Centroid search solver until residual below certain tolerance:
while (res > TOL){
win.width = (int) 80;
win.height = (int) 60;
//First array element at position (x,y) "lower left corner" of the search window:
win.x = (int) (x_c - win.width / 2) ;
win.y = (int) (y_c - win.height / 2);
//Modulo correction since modulo of negative integer is negative in C:
if (win.x < 0)
win.x = win.x % backproj.cols + backproj.cols ;
if (win.y < 0)
win.y = win.y % backproj.rows + backproj.rows ;
for (int i = 0; i < win.height; i++ ){
//Traverse along y-axis (height) i.e. rows ensuring wrap around top/bottom boundaries:
idx_row = (win.y + i) % (int)backproj.rows ;
for (int j = 0; j < win.width; j++ ){
//Traverse along x-axis (width) i.e. cols ensuring wrap around left/right boundaries:
idx_col = (win.x + j) % (int)backproj.cols ;
//Compute Moments:
m00 += (double) backproj.at<uchar>(idx_row, idx_col) ;
m10 += (double) backproj.at<uchar>(idx_row, idx_col) * i ;
m01 += (double) backproj.at<uchar>(idx_row, idx_col) * j ;
}
}
//Compute new centroid coordinates of the search window:
x_c_new = (int) ( m10 / m00 ) ;
y_c_new = (int) ( m01 / m00 );
//Compute the residual:
res = sqrt( pow((x_c_new - x_c), 2.0) + pow((y_c_new - y_c), 2.0) ) ;
//Set new search window centroid coordinates:
x_c = x_c_new;
y_c = y_c_new;
}
}
It's my second ever query on stackoverflow so please excuse me for any guidelines that I forgot to follow.
EDIT
changed m00 , m01 , m10 to block level variables within WHILE-LOOP instead of function level variables, thanks to Daniel Strul for pointing it out but the problem still remains. Now the search window jumps around the frame boundaries instead of focusing on the roi.
void moment(Mat &backproj, Rect &win){
int x_c, y_c, x_c_new, y_c_new;
int idx_row, idx_col;
double m00 , m01 , m10 ;
double res = 1.0, TOL = 0.003 ;
//Set the center of search window as the center of the probabilistic image:
y_c = (int) backproj.rows / 2 ;
x_c = (int) backproj.cols / 2 ;
//Centroid search solver until residual below certain tolerance:
while (res > TOL){
m00 = 0.0 , m01 = 0.0 , m10 = 0.0
win.width = (int) 80;
win.height = (int) 60;
//First array element at position (x,y) "lower left corner" of the search window:
win.x = (int) (x_c - win.width / 2) ;
win.y = (int) (y_c - win.height / 2);
//Modulo correction since modulo of negative integer is negative in C:
if (win.x < 0)
win.x = win.x % backproj.cols + backproj.cols ;
if (win.y < 0)
win.y = win.y % backproj.rows + backproj.rows ;
for (int i = 0; i < win.height; i++ ){
//Traverse along y-axis (height) i.e. rows ensuring wrap around top/bottom boundaries:
idx_row = (win.y + i) % (int)backproj.rows ;
for (int j = 0; j < win.width; j++ ){
//Traverse along x-axis (width) i.e. cols ensuring wrap around left/right boundaries:
idx_col = (win.x + j) % (int)backproj.cols ;
//Compute Moments:
m00 += (double) backproj.at<uchar>(idx_row, idx_col) ;
m10 += (double) backproj.at<uchar>(idx_row, idx_col) * i ;
m01 += (double) backproj.at<uchar>(idx_row, idx_col) * j ;
}
}
//Compute new centroid coordinates of the search window:
x_c_new = (int) ( m10 / m00 ) ;
y_c_new = (int) ( m01 / m00 );
//Compute the residual:
res = sqrt( pow((x_c_new - x_c), 2.0) + pow((y_c_new - y_c), 2.0) ) ;
//Set new search window centroid coordinates:
x_c = x_c_new;
y_c = y_c_new;
}
}
The reason your algorithms always converges to the upper left corner independently of the input data is that m00
, m10
and m01
are never reset to zero:
m00
, m10
and m01
, you compute the right value m0m00
, m10
and m01
, you actually sum the new moment with the old one and obtain ( m0 + m1 )At the very least, the moment variables should be reset at the beginning of each iteration.
Ideally, they should not be function-level variables but should rather be block-level variables, as they have no use outside the loop iterations (except for debugging purpose):
while (res > TOL){
...
double m00 = 0.0, m01 = 0.0, m10 = 0.0;
for (int i = 0; i < win.height; i++ ){
...
EDIT 1
The reason for the second problem you encounter (the ROI jumping all around the place) is that the computations of the moments are based on the relative coordinates i
and j
.
Thus, what you compute is [ avg(j) , avg(i) ], wher as what you really want is [ avg(y) , avg(x) ]. To solve this issue, I had proposed a first solution. I"ve replaced it by a much simpler solution below.
EDIT 2 The simplest solution is to add the coordinates of the ROI corner right at the end of each iteration:
x_c_new = win.x + (int) ( m10 / m00 ) ;
y_c_new = win.y + (int) ( m01 / m00 );