python, c++, image, opencv, image-processing

OpenCV's Warp Affine Has Lower Quality Compared to Photoshop


I want to transform and align a detected face (320x240) from a CelebA image (1024x1024) using OpenCV's cv2.warpAffine function, but the quality of the transformed image is significantly lower than when I align it by hand in Photoshop (left image transformed by Photoshop, right image transformed by OpenCV).

I have tried all of OpenCV's interpolation flags, but none of them comes close to Photoshop's quality.

The code I'm using is:

warped = cv2.warpAffine(image, TRANSFORM_MATRIX, (240, 320), flags=cv2.INTER_AREA)

What could be wrong that made the transformed image have such low quality?

Here's a link to the original 1024x1024 image, if needed.


Solution

  • Problem and general solution

    You are down-sampling a signal.

    The approach is always the same: lowpass (blur) first, then resample.

    What not to do

    If you don't do the lowpass, you'll get aliasing. You noticed that. Aliasing means the sampling step can completely miss some high frequency component (edge/corner/point/...), giving those strange artefacts.

    If you do the lowpass after resampling, it won't fix the issue, only hide it. The damage has already been done.

    You can convince yourself of both these aspects if you downsample some regular grid of strongly contrasting lines. Try alternating single-pixel lines of black and white for most effect.
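    Here's a minimal sketch of that experiment (the sigma value is my own rough choice, about half the decimation factor):

        import numpy as np
        import cv2 as cv

        # Test pattern: alternating single-pixel black and white lines.
        im = np.zeros((512, 512), dtype=np.uint8)
        im[::2, :] = 255  # every other row is white

        # No lowpass: decimating by 4 hits only every 4th row, so the
        # result comes out solid white or solid black depending on phase.
        # The stripes are aliased away entirely.
        naive = im[::4, ::4]

        # Lowpass first, then decimate: the stripes blend to gray, which
        # is the correct result for detail above the new Nyquist limit.
        blurred = cv.GaussianBlur(im, ksize=(0, 0), sigmaX=2.0)
        proper = blurred[::4, ::4]

        cv.imwrite("naive.png", naive)
        cv.imwrite("proper.png", proper)

    Doing the blur after the slicing instead will just smear the wrong (aliased) values around.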

    Implementations

    Libraries such as PIL do the lowpass implicitly before resampling.
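    For example (a sketch; the filenames are made up), PIL scales the filter's footprint to the reduction factor on its own:

        from PIL import Image

        im = Image.open("celeba_1024.png")  # hypothetical filename

        # The resampling filter's support grows with the reduction factor,
        # so no separate blur step is needed.
        small = im.resize((240, 240), Image.LANCZOS)
        small.save("face_240.png")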

    OpenCV generally does not. Not even Lanczos interpolation (in OpenCV) lets you skip the lowpass, because OpenCV's Lanczos uses a fixed-size kernel that does not widen with the scale factor.

    OpenCV has INTER_AREA, which behaves like linear interpolation but additionally averages over all source pixels lying in the area between the corner samples (instead of sampling just those four corners). This can spare you the extra lowpass step.

    Here's the result of cv.resize(im, (240, 240), interpolation=cv.INTER_AREA):

    [image: result of cv.resize]

    Here's the result of cv.warpAffine(im, M[:2], (240, 240), flags=cv.INTER_AREA) with M = np.eye(3) * 0.25 (an equivalent scaling):

    [image: result of cv.warpAffine, showing aliasing]

    It appears that warpAffine can't do INTER_AREA. That sucks for you :/

    If you need to downsample with OpenCV, and it's a power of two, you can use pyrDown. That does the lowpass and decimation... for a factor of two. Repeated application gives you higher powers.
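    A sketch of that (filename made up), going 1024 to 256 in two factor-of-two steps:

        import cv2 as cv

        im = cv.imread("celeba_1024.png")  # hypothetical filename, 1024x1024

        # Each pyrDown call blurs with a 5x5 Gaussian-like kernel and then
        # drops every other row and column: lowpass + decimation by 2.
        small = im
        for _ in range(2):  # 1024 -> 512 -> 256
            small = cv.pyrDown(small)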

    If you need arbitrary downsampling and you don't like INTER_AREA for some reason, you'd have to apply a GaussianBlur to the input. Sigma needs to be (inversely) proportional to the scale factor. There is some relation between the gaussian filter's sigma and the resulting cutoff frequency. You'll want to investigate that some more, if you don't want to pick a value arbitrarily. Check out the kernel for pyrDown, and what gaussian sigma it matches best. That's probably a good value for a scale factor of 0.5, and other factors should be (inversely) proportional.
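    Here's a sketch of that approach, using the equivalent-scaling matrix from above. The sigma heuristic is my own assumption, derived from matching the pyrDown kernel at a factor of 0.5, so treat it as a starting point rather than a tuned value:

        import numpy as np
        import cv2 as cv

        im = cv.imread("celeba_1024.png")  # hypothetical filename

        scale = 0.25             # overall scale factor of the warp
        M = np.eye(3) * scale    # same equivalent scaling as above

        # Assumed heuristic: pyrDown's [1 4 6 4 1]/16 kernel is close to a
        # Gaussian with sigma ~1 at scale 0.5, so grow sigma inversely
        # with the scale factor.
        sigma = 0.5 / scale      # ~2.0 for a 4x reduction

        blurred = cv.GaussianBlur(im, ksize=(0, 0), sigmaX=sigma)
        warped = cv.warpAffine(blurred, M[:2], (240, 240),
                               flags=cv.INTER_LINEAR)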

    For simple downscaling, one gaussian blur would be fine. For affine warps and more general transformations, you'd need lowpassing that respects the different scale of every single pixel that is looked up, because its "support" in the source image isn't square any longer, maybe not even rectangular, but an arbitrary quad!

    What am I not saying?

    This goes for down-sampling. If you up-sample, do not lowpass.