Tags: ios, swift, histogram, metal, metal-performance-shaders

MPSImageHistogramEqualization throws assertion that offset must be < [buffer length]


I'm trying to do histogram equalization using MPSImageHistogramEqualization on iOS, but it ends up throwing an assertion I do not understand. Here is my code:

    // Calculate Histogram
    var histogramInfo = MPSImageHistogramInfo(
        numberOfHistogramEntries: 256,
        histogramForAlpha: false,
        minPixelValue: vector_float4(0,0,0,0),
        maxPixelValue: vector_float4(1,1,1,1))
    let calculation = MPSImageHistogram(device: self.mtlDevice, histogramInfo: &histogramInfo)
    let bufferLength = calculation.histogramSize(forSourceFormat: sourceTexture.pixelFormat)
    let histogramInfoBuffer = self.mtlDevice.makeBuffer(length: bufferLength, options: [.storageModePrivate])!
    calculation.encode(to: commandBuffer,
                       sourceTexture: sourceTexture,
                       histogram: histogramInfoBuffer,
                       histogramOffset: 0)
    // Compute the equalization transform from the histogram
    let histogramEqualization = MPSImageHistogramEqualization(device: self.mtlDevice, histogramInfo: &histogramInfo)
    histogramEqualization.encodeTransform(to: commandBuffer,
                                          sourceTexture: sourceTexture,
                                          histogram: histogramInfoBuffer,
                                          histogramOffset: 0)

And here is the assertion that fires on that last call:

-[MTLDebugComputeCommandEncoder setBuffer:offset:atIndex:]:283: failed assertion `offset(4096) must be < [buffer length](4096).'

Any suggestions on what might be going on here?


Solution

  • This appears to be a bug in a specialized path in MPSImageHistogramEqualization, and I encourage you to file feedback on it.

    When numberOfHistogramEntries is greater than 256, the image kernel allocates an internal buffer large enough to hold the data it needs to work with (for N=512, this is 8192 bytes), plus a small amount of extra space (32 bytes). When the internal optimized256BinsUseCase flag is set, it allocates exactly 4096 bytes, omitting that extra storage. My suspicion is that subsequent operations rely on having more space after the initial data chunk, and inadvertently set the buffer offset past the end of the internal buffer. That would match the assertion you're seeing, where the offset (4096) is equal to the buffer length (4096).

    You may be able to work around this by using a different number of histogram bins, like 512. This wastes a little space and time, but I assume it will produce the same results.

    Alternatively, you might be able to avoid this crash by disabling the Metal validation layer, but I strongly discourage that, since you'll just be masking the underlying issue until it gets fixed.

    Note: I did my reverse-engineering of the MetalPerformanceShaders framework on macOS Catalina. Different platforms and different software versions likely have different code paths.
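
For reference, here is a minimal sketch of the 512-bin workaround described above. The helper name `encodeEqualization` and its parameters are hypothetical (they are not from the question), and I have not confirmed that 512 bins avoids the assertion on every OS version, so treat it as a starting point rather than a verified fix. It uses the same MPS calls as the question, plus the final `encode(commandBuffer:sourceTexture:destinationTexture:)` call that applies the transform.

```swift
import Metal
import MetalPerformanceShaders
import simd

/// Hypothetical helper: encodes histogram equalization using `binCount` bins.
/// The default of 512 is the suggested workaround, not a value required by MPS.
/// Returns false if the histogram buffer could not be allocated.
func encodeEqualization(device: MTLDevice,
                        commandBuffer: MTLCommandBuffer,
                        source: MTLTexture,
                        destination: MTLTexture,
                        binCount: Int = 512) -> Bool {
    var info = MPSImageHistogramInfo(
        numberOfHistogramEntries: binCount,
        histogramForAlpha: false,
        minPixelValue: vector_float4(0, 0, 0, 0),
        maxPixelValue: vector_float4(1, 1, 1, 1))

    // Ask MPS how large the histogram buffer must be for this pixel format.
    let histogram = MPSImageHistogram(device: device, histogramInfo: &info)
    let length = histogram.histogramSize(forSourceFormat: source.pixelFormat)
    guard let buffer = device.makeBuffer(length: length,
                                         options: [.storageModePrivate]) else {
        return false
    }

    // Pass 1: compute the histogram of the source texture.
    histogram.encode(to: commandBuffer,
                     sourceTexture: source,
                     histogram: buffer,
                     histogramOffset: 0)

    // Pass 2: build the equalization transform from that histogram.
    let equalization = MPSImageHistogramEqualization(device: device,
                                                     histogramInfo: &info)
    equalization.encodeTransform(to: commandBuffer,
                                 sourceTexture: source,
                                 histogram: buffer,
                                 histogramOffset: 0)

    // Pass 3: apply the transform, writing the result to `destination`.
    equalization.encode(commandBuffer: commandBuffer,
                        sourceTexture: source,
                        destinationTexture: destination)
    return true
}
```

Call it with your existing device, command buffer and textures, then commit the command buffer as usual. The destination texture needs `.shaderWrite` usage. Passing `binCount: 256` should reproduce the original configuration if you want to check whether the assertion still fires on a newer OS.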