I have converted a Keras model to an MLModel using coremltools 4.0, with limited success.
It works, but only if I use an MLMultiArray for the output and then convert the array to an image. That conversion takes orders of magnitude longer than the inference itself, which makes it unusable.
If I try to change the MLModel spec to use an image for the output, I get this error when running a prediction:
Failed to convert output Identity to image:
NSUnderlyingError=0x2809bad00 {Error Domain=com.apple.CoreML Code=0 "Invalid array shape ( 2048, 2048, 3 ) for converting to gray image"
This happens even though I have specified RGB for the output color space:
output { name: "Identity" type { imageType { width: 2048 height: 2048 colorSpace: RGB } } }
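For context, this is roughly how I am changing the spec in Python (the file names are placeholders):

import coremltools as ct
import coremltools.proto.FeatureTypes_pb2 as ft

# Load the spec of the converted model
spec = ct.utils.load_spec("model.mlmodel")

# Switch the "Identity" output from a multi-array to a 2048 x 2048 RGB image
output = spec.description.output[0]
output.type.imageType.width = 2048
output.type.imageType.height = 2048
output.type.imageType.colorSpace = ft.ImageFeatureType.RGB

ct.utils.save_spec(spec, "model_image_output.mlmodel")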
If I use an MLMultiArray for the output (which works), Xcode reports:
output: Float32 1 x 2048 x 2048 x 3 array
I suspect the problem is the first dimension, which is the batch size, but the spec lists no dimensions at all for the output, so I can't simply delete the batch dimension:
output { name: "Identity" type { multiArrayType { dataType: FLOAT32 } } }
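If the dimensions were listed, I assume I could edit them in place. This untested sketch (continuing from the spec loaded above, and using what I understand to be Core ML's channels-first order) is what I have in mind:

ma = spec.description.output[0].type.multiArrayType
del ma.shape[:]                   # clear any existing dimensions
ma.shape.extend([3, 2048, 2048])  # static shape with no batch dimension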
I don't think I can just add an output shape to the Keras Conv2D output layer, because it has multiple inbound nodes with different shapes. Here are its output shapes:
>>> print(outputLayer.get_output_shape_at(0))
(None, None, None, 3)
>>> print(outputLayer.get_output_shape_at(1))
(1, 512, 512, 3)
>>> print(outputLayer.get_output_shape_at(2))
(1, 2048, 2048, 3)
>>> print(outputLayer.output)
Tensor("SR/Identity:0", shape=(None, None, None, 3), dtype=float32)
I think coremltools is confusing the batch dimension with the channels dimension, which is why it attempts to create a grayscale image even when I specify RGB.
Any idea how to fix it?
I have the original Keras model, but I don't see how to specify shapes without a batch dimension. Here are the beginning and end of the Keras model's layer summary, followed by a sketch of a workaround I'm considering:
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
LR_input (InputLayer)           [(None, None, None,  0
__________________________________________________________________________________________________
Pre_blocks_conv (Conv2D)        multiple             896         LR_input[0][0]
__________________________________________________________________________________________________
F_1_1_1 (Conv2D)                multiple             9248        Pre_blocks_conv[0][0]
__________________________________________________________________________________________________
...                             multiple
...                             multiple
__________________________________________________________________________________________________
SR (Conv2D)                     multiple             84          PixelShuffle[0][0]
==================================================================================================
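One workaround would be to rebuild the model around a fixed-size input so that every downstream shape becomes static (untested; model is the loaded Keras model, and the input size is just an example):

from tensorflow import keras

# Pin the input to a concrete size with batch size 1 so the shared layers
# produce a single static output shape instead of (None, None, None, 3)
fixed_input = keras.Input(shape=(512, 512, 3), batch_size=1, name="LR_input")
fixed_model = keras.Model(fixed_input, model(fixed_input))
print(fixed_model.output_shape)  # should now be fully static, e.g. (1, 2048, 2048, 3)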
In Core ML the order of the dimensions is (channels, height, width), so it expects to see a 3 x 2048 x 2048 output instead of 2048 x 2048 x 3.
Note that you also need to make sure the output pixels are in the range [0, 255] instead of [0, 1], which is probably what your Keras model gives you.
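One way to handle the pixel range (a sketch, assuming your Keras model object is named model and currently outputs values in [0, 1]) is to append a scaling layer before converting:

from tensorflow.keras import layers, Model

# Multiply the [0, 1] output by 255 so the values can be interpreted
# as 8-bit pixel intensities after conversion
scaled = layers.Lambda(lambda t: t * 255.0, name="SR_pixels")(model.output)
wrapped_model = Model(model.input, scaled)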