I have implemented a PHOW feature detector in MATLAB, as follows:
[frames, descrs] = vl_phow(im);
which is a wrapper around the code:
...
for i = 1:4
    ims = vl_imsmooth(im, scales(i) / 3) ;
    [frames{i}, descrs{i}] = vl_dsift(ims, 'Fast', 'Step', step, 'Size', scales(i)) ;
end
...
I'm now writing an implementation in C++ with OpenCV and VLFeat. This is part of my code to compute the PHOW features of an image (Mat image):
...
//convert into float array
float* img_vec = im2single(image);
//create filter
VlDsiftFilter* vlf = vl_dsift_new(image.cols, image.rows);
double bin_sizes[] = { 3, 4, 5, 6 };
double magnif = 3;
double* scales = (double*)malloc(4*sizeof(double));
for (size_t i = 0; i < 4; i++)
{
    scales[i] = bin_sizes[i] / magnif;
}
for (size_t i = 0; i < 4; i++)
{
    double sigma = sqrt(pow(scales[i], 2) - 0.25);
    //smooth float array image
    float* img_vec_smooth = (float*)malloc(image.rows*image.cols*sizeof(float));
    vl_imsmooth_f(img_vec_smooth, image.cols, img_vec, image.cols, image.rows, image.cols, sigma, sigma);
    //run DSIFT
    vl_dsift_process(vlf, img_vec_smooth);
    //number of keypoints found
    int keypoints_num = vl_dsift_get_keypoint_num(vlf);
    //extract keypoints
    const VlDsiftKeypoint* vlkeypoints = vl_dsift_get_keypoints(vlf);
    //descriptor dimension
    int dim = vl_dsift_get_descriptor_size(vlf);
    //extract descriptors
    const float* descriptors = vl_dsift_get_descriptors(vlf);
...
//return all descriptors of different scales
I'm not sure whether the function should return the set of all descriptors for all scales, which requires a lot of storage space when processing several images, or the result of some operation that combines the descriptors of the different scales. Can you help me with this? Thanks.
You can do either. The simplest option is to concatenate the descriptors from the different levels. I believe this is what VLFeat does (at least, the documentation doesn't mention anything more). Removing the descriptors below your contrast threshold should help, but you'll still have several thousand, depending on the size of your image. You could also compare descriptors that occur near the same location and prune some of them. It's a bit of a time-space trade-off. Generally, I've seen the bin sizes spaced apart (by intervals of 2, but it could be more), which should reduce the need to check for overlapping descriptors.
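To make the concatenation option concrete, here is a minimal sketch of how the per-scale results could be accumulated and filtered by contrast. It does not call VLFeat itself: `Keypoint` is a hypothetical stand-in that mirrors the fields of `VlDsiftKeypoint`, and `PhowFeatures`, `append_scale` and `contrast_threshold` are names I made up for illustration. It copies the data on the assumption that the pointers returned by the filter are reused by the next `vl_dsift_process` call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for VlDsiftKeypoint (position, scale, descriptor norm).
struct Keypoint {
    double x, y, s, norm;
};

// Hypothetical accumulator for the features of all scales of one image.
struct PhowFeatures {
    std::vector<Keypoint> frames;
    std::vector<float> descriptors;  // frames.size() * dim values, stored back to back
    int dim = 0;
};

// Appends the keypoints and descriptors of one scale to `out`, skipping
// every keypoint whose norm is below `contrast_threshold`.
// `descriptors` is expected to hold num_keypoints * dim floats.
void append_scale(PhowFeatures& out,
                  const Keypoint* keypoints,
                  const float* descriptors,
                  std::size_t num_keypoints,
                  int dim,
                  double contrast_threshold)
{
    out.dim = dim;
    for (std::size_t k = 0; k < num_keypoints; ++k) {
        if (keypoints[k].norm < contrast_threshold) {
            continue;  // low-contrast patch, drop it
        }
        out.frames.push_back(keypoints[k]);
        const float* d = descriptors + k * dim;
        // Copy the descriptor so it survives later processing of other scales.
        out.descriptors.insert(out.descriptors.end(), d, d + dim);
    }
}
```

Inside your scale loop you would then call something like `append_scale(features, reinterpret_cast<const Keypoint*>(vlkeypoints), descriptors, keypoints_num, dim, threshold)`, or better, copy the fields of each `VlDsiftKeypoint` explicitly instead of casting. After the loop, `features` holds the concatenated result for the image.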