Tags: ios, video-processing, metal, core-image, hdr

How to correctly handle HDR10 in a custom Metal Core Image kernel?


I've got a custom Metal Core Image kernel (written with CIImageProcessorKernel) that I'm trying to make work properly with HDR video (HDR10 PQ to start).

I understand that for HDR video, the RGB values coming into the shader can be below 0.0 or above 1.0. However, I don't understand how the 10-bit integer values (i.e., 0-1023) in the video are mapped to floating point.

What are the minimum and maximum values in floating point? That is, what will a 1023 (pure white) pixel be in floating point in the shader?

At 11:32 in WWDC20 session 10009, Edit and play back HDR video with AVFoundation, there's an example of a Core Image Metal kernel that isn't HDR aware and therefore won't work. It's inverting the values that come in by subtracting them from 1.0, which clearly breaks down when 1.0 is not the maximum possible value. How should this be implemented to be HDR aware?

extern "C" float4 ColorInverter(coreimage::sample_t s, coreimage::destination dest) {
    return float4(1.0 - s.r, 1.0 - s.g, 1.0 - s.b, 1.0);
}

Solution

  • In your kernel, colors are usually normalized to [0.0 ... 1.0], based on the underlying color space. So even if values are stored as 10-bit integers in a texture, your shader will get them as normalized floats.

    I emphasize the color space because it is used when translating the colors from the source into those normalized values. When you are using the default sRGB color space, the wide gamut of the HDR source doesn't fit into the sRGB [0.0 ... 1.0] range. That's why you may get values outside that range in your kernel. This is actually useful in most cases, because most filter operations that are designed for sRGB still work with such values. The color invert example above, however, does not.
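To make the failure concrete, here is a minimal sketch in plain C++ that runs the per-channel math on the CPU. The helper names `normalizeUnorm10` and `naiveInvert` are made up for illustration. The division by 1023 is the generic unsigned-normalized mapping for a raw 10-bit texture read; it is an assumption that this resembles what happens before color management, and the values Core Image actually delivers to a kernel depend on the working color space.

```cpp
#include <cassert>
#include <cstdint>

// Generic unsigned-normalized (unorm) mapping of a 10-bit code value:
// 0 maps to 0.0 and 1023 maps to 1.0. Assumption: this models a raw
// texture read only, not the color-managed values a Core Image kernel sees.
float normalizeUnorm10(std::uint16_t code) {
    assert(code <= 1023);  // 10-bit values only
    return static_cast<float>(code) / 1023.0f;
}

// Per-channel math of the non-HDR-aware kernel from the question.
float naiveInvert(float x) {
    return 1.0f - x;
}
```

With these helpers, `naiveInvert(0.25f)` gives 0.75, but an extended-range input such as `naiveInvert(2.5f)` gives -1.5, which is no longer a meaningful inverted color.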

    You have two options here that I know of:

    You can change the workingColorSpace of the CIContext you are using to the HDR color space of the input:

    let ciContext = CIContext(options: [.workingColorSpace: CGColorSpace(name: CGColorSpace.itur_2020)!])
    

    Then all color values in your kernel should fall within [0.0 ... 1.0], where 0.0 is the darkest HDR color value and 1.0 is the brightest. You can then safely perform the inversion with 1.0 - x. However, keep in mind that some other filters will no longer produce the correct result, because they assume the input to be in (linear) sRGB, which is Core Image's default.
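As a rough illustration of why the inversion is safe once every channel lies in [0.0 ... 1.0], the following plain C++ sketch mirrors the kernel's math on the CPU. `Float4`, `colorInverter`, and `inUnitRange` are hypothetical stand-ins for illustration, not Metal or Core Image types.

```cpp
// CPU stand-in for Metal's float4 (hypothetical, for illustration only).
struct Float4 {
    float r, g, b, a;
};

// Same math as the ColorInverter kernel: invert RGB, force alpha to 1.0.
Float4 colorInverter(Float4 s) {
    return Float4{1.0f - s.r, 1.0f - s.g, 1.0f - s.b, 1.0f};
}

// True if every color channel lies in [0.0, 1.0].
bool inUnitRange(Float4 c) {
    auto ok = [](float v) { return v >= 0.0f && v <= 1.0f; };
    return ok(c.r) && ok(c.g) && ok(c.b);
}
```

If the input satisfies `inUnitRange`, so does the output of `colorInverter`; if any channel exceeds 1.0, the corresponding output channel goes negative.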

    The second option is to convert ("color match") the input into the correct color space before passing it into your kernel, and to convert the result back to the working space before returning it:

    let colorSpace = CGColorSpace(name: CGColorSpace.itur_2020)!
    let colorMatchedInput = inputImage.matchedFromWorkingSpace(to: colorSpace)
    let kernelOutput = myKernel.apply(..., [colorMatchedInput, ...])
    return kernelOutput.matchedToWorkingSpace(from: colorSpace)