I have two MLModels in my app. The first one generates an MLMultiArray output that is meant to be used as the second model's input.

Since I'm trying to make this as performant as possible, I was thinking about using VNImageRequestHandler, feeding it the first model's output (the MLMultiArray), and letting Vision handle the resizing and regionOfInterest cropping. That way I would avoid having to convert the first model's output to an image, crop it, and do everything manually with the regular image-based initializer.
Something like this:
let request = VNCoreMLRequest(model: mlModel) { (request, error) in
    // handle logic?
}
request.regionOfInterest = // my region
let handler = VNImageRequestHandler(multiArray: myFirstModelOutputMultiArray)
Or do I have to go through back-and-forth conversions? I'm trying to reduce processing delays.
Vision uses images (hence the name ;-) ). If you don't want to use images, you need to use the Core ML API directly.
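If you stay with the MLMultiArray, a minimal sketch of the direct Core ML route could look like the following. The feature names "features" and "output", the URL compiledSecondModelURL, and the variable firstOutputMultiArray are placeholders, not your actual names; check the second model's modelDescription for the real input/output names.

import CoreML

// Sketch: feed the first model's MLMultiArray output straight into the second model.
// Feature names and the compiled-model URL are assumptions for illustration.
let secondModel = try MLModel(contentsOf: compiledSecondModelURL)
let inputProvider = try MLDictionaryFeatureProvider(
    dictionary: ["features": MLFeatureValue(multiArray: firstOutputMultiArray)]
)
let prediction = try secondModel.prediction(from: inputProvider)
let result = prediction.featureValue(for: "output")?.multiArrayValue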
If the output from the first model really is an image, it's easiest to change that model's output type to an image so that you get a CVPixelBuffer instead of an MLMultiArray. Then you can directly pass this CVPixelBuffer into the next model using Vision.
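For the image route, here is a rough sketch, assuming the first model has been re-exported with an image output so its prediction hands you a CVPixelBuffer (called firstOutputPixelBuffer below), and that secondMLModel is the second model's MLModel instance:

import Vision

// Sketch: wrap the second model for Vision and run it on the pixel buffer
// produced by the first model. Vision handles the resizing and the crop.
let vnModel = try VNCoreMLModel(for: secondMLModel)
let request = VNCoreMLRequest(model: vnModel) { request, error in
    guard let observations = request.results as? [VNCoreMLFeatureValueObservation] else { return }
    // Read the second model's output from `observations` here.
}
request.regionOfInterest = CGRect(x: 0.25, y: 0.25, width: 0.5, height: 0.5) // example crop
request.imageCropAndScaleOption = .scaleFill

let handler = VNImageRequestHandler(cvPixelBuffer: firstOutputPixelBuffer, options: [:])
try handler.perform([request])

Note that regionOfInterest is a normalized rectangle with the origin in the lower-left corner, so you don't have to crop the pixel buffer yourself.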