I am trying to write a convolution algorithm for grayscale BMP images. The code below is from an image processing course on Udemy, but the explanation of the variables and the formula used was rather brief. The issue is in the 2D discrete convolution part: I am not able to understand how the formula is implemented here.
#include <stdio.h>
#include <stdlib.h>

/* BMP_HEADER_SIZE, BMP_COLOR_TABLE_SIZE, CUSTOM_IMG_SIZE, imageReader and
   imageWriter are not shown here. */

struct Mask{
    int Rows;
    int Cols;
    unsigned char *Data;
};

void Convolve(int imgRows, int imgCols, struct Mask *myMask, unsigned char *input_buf, unsigned char *output_buf);

int main()
{
    int imgWidth, imgHeight, imgBitDepth;
    unsigned char imgHeader[BMP_HEADER_SIZE];
    unsigned char imgColorTable[BMP_COLOR_TABLE_SIZE];
    unsigned char imgBuffer[CUSTOM_IMG_SIZE];
    unsigned char imgBuffer2[CUSTOM_IMG_SIZE];
    const char imgName[] = "images/cameraman.bmp";
    const char newImgName[] = "images/cameraman_new.bmp";
    struct Mask lpMask;
    signed char *tmp;
    int i;
    lpMask.Cols = lpMask.Rows = 5;
    lpMask.Data = (unsigned char *)malloc(25);
    /* -1 -1 -1 -1 -1
       -1 -1 -1 -1 -1
       -1 -1 24 -1 -1
       -1 -1 -1 -1 -1
       -1 -1 -1 -1 -1*/
    //set all mask values to -1
    tmp = (signed char *)lpMask.Data;
    for (i = 0; i < 25; ++i)
    {
        *tmp = -1;
        ++tmp;
    }
    //set middle value to 24 (row 2, col 2 of the 5x5 mask = offset 2 * 5 + 2 = 12)
    tmp = (signed char *)lpMask.Data + 12;
    *tmp = 24;
    imageReader(imgName, &imgHeight, &imgWidth, &imgBitDepth, imgHeader, imgColorTable, imgBuffer);
    Convolve(imgHeight, imgWidth, &lpMask, imgBuffer, imgBuffer2);
    imageWriter(newImgName, imgHeader, imgColorTable, imgBuffer2, imgBitDepth);
    printf("Success!\n");
    return 0;
}

//2D Discrete Convolution
void Convolve(int imgRows, int imgCols, struct Mask *myMask, unsigned char *input_buf, unsigned char *output_buf)
{
    long i, j, m, n, idx, jdx;
    int ms, im, val;
    unsigned char *tmp;
    //outer summation loop - image
    for (i = 0; i < imgRows; ++i)
        //inner summation loop - image
        for (j = 0; j < imgCols; ++j)
        {
            val = 0;
            //outer summation loop - mask
            for (m = 0; m < myMask->Rows; ++m)
                //inner summation loop - mask
                for (n = 0; n < myMask->Cols; ++n)
                {
                    //Issue in understanding below part
                    ms = (signed char)*(myMask->Data + m * myMask->Rows + n);
                    // index of input img, used for checking boundary
                    idx = i - m;
                    jdx = j - n;
                    if (idx >= 0 && jdx >= 0) //ignore input samples which are out of bound
                    {
                        im = *(input_buf + idx * imgRows + jdx);
                        val += ms * im; //accumulate only when the sample is in bounds
                    }
                }
            //truncate values to remain inside 0to255 range
            if (val > 255) val = 255;
            if (val < 0) val = 0;
            tmp = output_buf + i * imgRows + j;
            *tmp = (unsigned char)val;
        }
}
The three lines below all use a similar formula, and their implementation is the hardest part to understand. If possible, please help me understand the logic of these lines, or what exactly they are doing:
ms = (signed char)*(myMask->Data + m * myMask->Rows + n);
im = *(input_buf + idx * imgRows + jdx);
tmp = output_buf + i * imgRows + j;
For the formula/pseudocode used, see the Convolution section of the following page: https://en.wikipedia.org/wiki/Kernel_(image_processing)
OR
$$g(x,y) = \sum_{k=-n_2}^{n_2} \sum_{j=-m_2}^{m_2} h(j,k)\, f(x-j,\, y-k)$$
where $m_2$ is half of the mask's width and $n_2$ is half of the mask's height.
The expressions you ask about simply compute the location of a particular pixel, indexed in two dimensions (row, column), within a flat memory buffer.
For example, take ms = (signed char)*(myMask->Data + m * myMask->Rows + n);
Start with the mask data buffer itself, myMask->Data, which is a pointer. The first row of data comes first, followed by the second row, and so on. So to access the pixel at row m, column n, you first have to skip m rows of data, which is m times the size of a row. Then you have to skip n pixels within that row. Once the location of the pixel is computed, it is dereferenced with *.
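To make the row-major arithmetic concrete, here is a minimal sketch in C. The helper names flat_index and mask_at are hypothetical and do not appear in the course code; they only isolate the "skip whole rows, then skip pixels within the row" step described above.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col) in a row-major buffer whose rows each
   hold `cols` elements. `flat_index` is a hypothetical helper name. */
size_t flat_index(size_t row, size_t col, size_t cols)
{
    assert(col < cols);          /* the column must lie inside the row */
    return row * cols + col;     /* skip `row` full rows, then `col` pixels */
}

/* Read a signed mask coefficient stored in an unsigned char buffer,
   mirroring the (signed char) cast used in the question's code. */
int mask_at(const unsigned char *data, size_t row, size_t col, size_t cols)
{
    return (signed char)data[flat_index(row, col, cols)];
}
```

With 5 elements per row, flat_index(2, 2, 5) is 2 * 5 + 2 = 12, so the centre of a 5x5 mask sits at offset 12 from the start of the buffer.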
The only complaint I have about this example code is its use of myMask->Rows. Here m is a row index, and to compute the offset it must be multiplied by the size of a row, which is the number of columns in the mask image, not the number of rows. So that reference should instead be myMask->Cols. With the 5x5 mask set up in main, Rows and Cols are equal, so the result happens to be the same.
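As a rough sketch of that correction, the function below uses the column count as the row stride everywhere. Several things here are my own assumptions rather than the course's code: the name convolve_sketch, passing the mask as a plain signed char array instead of struct Mask, applying the same stride reasoning to the image buffers (imgCols rather than imgRows), and skipping out-of-bounds samples entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of the question's Convolve with the row stride taken from the
   column count. `convolve_sketch` and its parameters are illustrative. */
void convolve_sketch(int imgRows, int imgCols,
                     int maskRows, int maskCols, const signed char *mask,
                     const unsigned char *in, unsigned char *out)
{
    assert(imgRows > 0 && imgCols > 0 && maskRows > 0 && maskCols > 0);

    for (int i = 0; i < imgRows; ++i) {
        for (int j = 0; j < imgCols; ++j) {
            int val = 0;
            for (int m = 0; m < maskRows; ++m) {
                for (int n = 0; n < maskCols; ++n) {
                    int idx = i - m;   /* input row paired with mask row m */
                    int jdx = j - n;   /* input column paired with mask column n */
                    if (idx < 0 || jdx < 0)
                        continue;      /* sample lies outside the image: skip it */
                    int ms = mask[(size_t)m * maskCols + n];    /* stride = maskCols */
                    int im = in[(size_t)idx * imgCols + jdx];   /* stride = imgCols */
                    val += ms * im;
                }
            }
            /* clamp the sum to the 0..255 range of an 8-bit pixel */
            if (val > 255) val = 255;
            if (val < 0) val = 0;
            out[(size_t)i * imgCols + j] = (unsigned char)val;
        }
    }
}
```

A non-square case shows why the stride matters: for a 2x3 image {1, 2, 3, 4, 5, 6} and a 1x2 mask {1, 1}, each output pixel is the input pixel plus its left neighbour, giving {1, 3, 5, 4, 9, 11}. Using the row count as the stride would read the wrong pixels whenever the image is not square.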