
How does the "compressed form" of `cv::convertMaps` work?


The documentation for convertMaps says that it supports the following transformation:

(CV_32FC1, CV_32FC1)→(CV_16SC2, CV_16UC1): This is the most frequently used conversion operation, in which the original floating-point maps (see remap) are converted to a more compact and much faster fixed-point representation. The first output array contains the rounded coordinates and the second array (created only when nninterpolation=false) contains indices in the interpolation tables.

I understand that (CV_32FC1, CV_32FC1) is encoding (x, y) coordinates as floats. How does the fixed point format work? What is encoded in each 2-channel entry of the CV_16SC2 matrix? What interpolation tables does the CV_16UC1 matrix index into?


Solution

  • I'm going by what I remember from the last time I investigated this. Grain of salt and all that.

    The fixed point format splits the integer and fractional parts of your (x,y)-coordinates into different maps.

    It is "compact" in that CV_32FC2 or 2x CV_32FC1 uses 8 bytes per pixel (two 4-byte floats), while CV_16SC2 + CV_16UC1 uses only 6 bytes per pixel (two 2-byte signed integers plus one 2-byte unsigned integer). It's also integer-only, so using it can free up floating-point compute resources for other work.

    The integer parts go into the first map, which is 2-channel. No surprises there.

    The fractional parts are quantized to 5 bits each, i.e. they're multiplied by 32 and stored as integers in the range 0 .. 31. Then they're packed together into one number: the 5 fractional bits from one coordinate on the bottom, the 5 bits from the other one on top of that.

    The resulting funny number has a range of 0 .. 1023, or 0b00000_00000 .. 0b11111_11111, which encodes fractional parts (0.0, 0.0) and (0.96875, 0.96875) respectively. If you guessed that 0.96875 is 31/32, you're right!
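
    To make the split concrete, here is a minimal sketch of the encoding in plain C++ (no OpenCV needed). The names `FixedCoord`, `encode` and `decode` are made up for illustration, and two things are assumptions on my part: that the x fraction sits in the low 5 bits with the y fraction above it, and that coordinates are rounded to the nearest 1/32. The real `convertMaps` may differ in details such as rounding and handling of negative coordinates.

    ```cpp
    #include <cassert>
    #include <cmath>
    #include <cstdint>

    // Illustrative constants for the 5-bit split described above.
    constexpr int kFracBits = 5;
    constexpr int kFracSize = 1 << kFracBits;  // 32 sub-pixel steps per axis

    // Hypothetical container for one map entry pair.
    struct FixedCoord {
        std::int16_t ix, iy;   // integer parts: one entry of the CV_16SC2 map
        std::uint16_t frac;    // packed fractions: one entry of the CV_16UC1 map
    };

    // Split (x, y) into integer parts and a packed 10-bit fraction index.
    // Assumption: x's fraction goes in the low 5 bits, y's in the next 5.
    FixedCoord encode(float x, float y) {
        // This sketch only handles non-negative coordinates that fit in 16 bits.
        assert(x >= 0.0f && y >= 0.0f && x < 32767.0f && y < 32767.0f);

        // Scale by 32 and round, so each coordinate is in units of 1/32 pixel.
        const int fx = static_cast<int>(std::lround(x * kFracSize));
        const int fy = static_cast<int>(std::lround(y * kFracSize));

        FixedCoord c;
        c.ix = static_cast<std::int16_t>(fx >> kFracBits);
        c.iy = static_cast<std::int16_t>(fy >> kFracBits);
        c.frac = static_cast<std::uint16_t>(
            ((fy & (kFracSize - 1)) << kFracBits) | (fx & (kFracSize - 1)));
        return c;
    }

    // Recover the quantized coordinates (accurate to 1/32 of a pixel).
    void decode(const FixedCoord& c, float& x, float& y) {
        x = c.ix + (c.frac & (kFracSize - 1)) / static_cast<float>(kFracSize);
        y = c.iy + (c.frac >> kFracBits) / static_cast<float>(kFracSize);
    }
    ```

    With this sketch, `encode(10.5f, 3.25f)` yields integer parts (10, 3) and a packed index of 272, i.e. 8 * 32 + 16, since 0.25 is 8/32 and 0.5 is 16/32. Note that rounding can carry into the integer part: 1.99 is closer to 64/32 than to 63/32, so it ends up as integer part 2 with fraction 0.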

    During remap...

    The integer map is used to look up, for every resulting pixel, several pixels in the source image required for interpolation. For linear interpolation, that would be pixels (i,j), (i,j+1), (i+1,j), (i+1,j+1).

    The fractional map is taken as an index into a 1024-entry lookup table that contains interpolation coefficients. OpenCV has several tables, one for each interpolation mode (linear, cubic, ...). These tables contain whatever factors and shifts are required to correctly blend the several sampled pixels into one resulting pixel, all using integer arithmetic (fixed-point arithmetic).
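
    As a rough illustration of the lookup step for the linear case, here is a self-contained C++ sketch that builds a 1024-entry table of integer weights and blends four neighbouring pixels with it. Everything here (`Weights`, `buildBilinearTable`, `sample`, the 10-bit weight scale, x fraction in the low 5 bits) is my own simplification. OpenCV's actual tables use their own layout and scaling and are wrapped in vectorized code, so treat this as the idea, not the implementation.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    constexpr int kFracBits = 5;
    constexpr int kFracSize = 1 << kFracBits;          // 32 steps per axis
    constexpr int kTableSize = kFracSize * kFracSize;  // 1024 entries
    constexpr int kWeightBits = 2 * kFracBits;         // weights sum to 1 << 10

    // Four integer weights: top-left, top-right, bottom-left, bottom-right.
    using Weights = std::array<std::int32_t, 4>;

    // One entry per packed fraction index (fy * 32 + fx).
    std::vector<Weights> buildBilinearTable() {
        std::vector<Weights> table(kTableSize);
        for (int fy = 0; fy < kFracSize; ++fy) {
            for (int fx = 0; fx < kFracSize; ++fx) {
                table[static_cast<std::size_t>(fy * kFracSize + fx)] = Weights{
                    (kFracSize - fx) * (kFracSize - fy),
                    fx * (kFracSize - fy),
                    (kFracSize - fx) * fy,
                    fx * fy};
            }
        }
        return table;
    }

    // Sample a row-major 8-bit grayscale image using only integer arithmetic.
    // (ix, iy) comes from the integer map, frac from the fractional map.
    std::uint8_t sample(const std::vector<std::uint8_t>& img, int width, int height,
                        int ix, int iy, std::uint16_t frac,
                        const std::vector<Weights>& table) {
        // No border handling in this sketch: all four neighbours must exist.
        assert(ix >= 0 && iy >= 0 && ix + 1 < width && iy + 1 < height);
        assert(frac < kTableSize);

        const Weights& w = table[frac];
        const std::uint8_t* p = &img[static_cast<std::size_t>(iy * width + ix)];
        const int sum = w[0] * p[0] + w[1] * p[1] +
                        w[2] * p[width] + w[3] * p[width + 1];
        // Add half the scale before shifting so the result is rounded.
        return static_cast<std::uint8_t>((sum + (1 << (kWeightBits - 1))) >> kWeightBits);
    }
    ```

    For a 2x2 image with pixel values 10, 20, 30, 40, sampling at integer position (0, 0) with fraction index 528 (that is 16 * 32 + 16, the centre of the four pixels) gives 25, the plain average. Because the four weights in every entry sum to exactly 1024, the final shift by 10 bits normalizes the result without any floating-point math.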