image-processing, image-segmentation, google-vision, google-mlkit

Poor selfie segmentation with Google ML Kit


I am using Google ML Kit to do selfie segmentation (https://developers.google.com/ml-kit/vision/selfie-segmentation). However, the output I am getting is extremely poor.
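For reference, the segmenter is invoked roughly like this (a minimal Kotlin sketch using the standard ML Kit selfie segmentation API; the `segment` helper and the surrounding code are just for illustration):

```kotlin
import android.graphics.Bitmap
import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.segmentation.Segmentation
import com.google.mlkit.vision.segmentation.SegmentationMask
import com.google.mlkit.vision.segmentation.selfie.SelfieSegmenterOptions

fun segment(bitmap: Bitmap) {
    // Single-image mode (as opposed to STREAM_MODE, which is meant for video frames).
    val options = SelfieSegmenterOptions.Builder()
        .setDetectorMode(SelfieSegmenterOptions.SINGLE_IMAGE_MODE)
        .build()
    val segmenter = Segmentation.getClient(options)

    val image = InputImage.fromBitmap(bitmap, /* rotationDegrees = */ 0)
    segmenter.process(image)
        .addOnSuccessListener { mask: SegmentationMask ->
            // mask.buffer holds one float per pixel (row-major): the confidence
            // that the pixel belongs to the person (the foreground).
            val buffer = mask.buffer
            val width = mask.width
            val height = mask.height
            // ... turn the confidences into the pink overlay / composite shown below
        }
        .addOnFailureListener { e -> e.printStackTrace() }
}
```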

Initial image:

[image: original photo]

Segmented image with overlay: observe how the woman's hair is marked pink and the gym equipment and surroundings near her legs are marked non-pink. Even her hands are marked pink (meaning they are treated as background).

[image: segmentation overlay]

When this is overlaid on another image to create a background removal effect, it looks terrible:

[image: background-replaced composite]
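The composite itself is the usual per-pixel alpha blend with the mask confidence as alpha, along these lines (a sketch; the `replaceBackground` helper is just for illustration, it assumes the foreground, background, and mask all have the same dimensions, and it uses slow getPixel/setPixel calls for readability):

```kotlin
import android.graphics.Bitmap
import android.graphics.Color
import com.google.mlkit.vision.segmentation.SegmentationMask

// Blend the selfie over a new background, using each pixel's foreground
// confidence as the alpha value.
fun replaceBackground(foreground: Bitmap, background: Bitmap, mask: SegmentationMask): Bitmap {
    val width = mask.width
    val height = mask.height
    val out = Bitmap.createBitmap(width, height, Bitmap.Config.ARGB_8888)
    val buffer = mask.buffer
    buffer.rewind()
    for (y in 0 until height) {
        for (x in 0 until width) {
            val alpha = buffer.getFloat()  // confidence that (x, y) is the person
            val fg = foreground.getPixel(x, y)
            val bg = background.getPixel(x, y)
            fun mix(f: Int, b: Int) = (f * alpha + b * (1 - alpha)).toInt()
            out.setPixel(x, y, Color.rgb(
                mix(Color.red(fg), Color.red(bg)),
                mix(Color.green(fg), Color.green(bg)),
                mix(Color.blue(fg), Color.blue(bg))
            ))
        }
    }
    return out
}
```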

The segmentation mask returned by ML Kit has a confidence of 1.0 for all of the non-pink areas above, meaning it is absolutely certain that the non-pink areas are part of the person!

I am seeing this for several images, not just this one. In fact, the performance (confidence) is pretty poor for an image segmenter.
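The confidence figure above comes from scanning the raw mask buffer, roughly as below (a quick diagnostic sketch; the `countConfidentForeground` helper and the 0.99 threshold are just for illustration):

```kotlin
import com.google.mlkit.vision.segmentation.SegmentationMask

// Count how many pixels the model reports as near-certain foreground (person).
fun countConfidentForeground(mask: SegmentationMask, threshold: Float = 0.99f): Int {
    val buffer = mask.buffer
    buffer.rewind()
    var count = 0
    repeat(mask.width * mask.height) {
        if (buffer.getFloat() >= threshold) count++
    }
    return count
}
```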

The question is: is there a way to improve this, perhaps by providing a different/better model? If I use something like PixelLib, the segmentation is far better, but that library is not low-latency, so it can't run on a mobile device.

Any pointers/help regarding this would be really appreciated.


Solution

  • It might be too optimistic to expect a lightweight, real-time, CPU-based selfie model to provide accurate segmentation results for such a complex and, in some ways, tricky scene (the pose, and the black color of both the background and the outfit).

    The official example highlights the fact that complex environments are likely to be a problem.

    [image: official example]

    The only "simple" way of processing your scene is to use depth estimation. Just did a quick test with a pretty complex model:

    [image: depth estimation result]

    The results are far from usable (at least in a fully automated way). There are several other options:

    [image: results of other options]
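For anyone who wants to try the depth-based idea on-device, here is a rough sketch, assuming a MiDaS-style monocular depth model converted to TensorFlow Lite. The `DepthMasker` class, the model buffer, the 256x256 input size, the output shape and the depth cutoff are all assumptions and depend on the depth model you pick:

```kotlin
import android.graphics.Bitmap
import org.tensorflow.lite.Interpreter
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Rough person mask from monocular depth: treat everything "closer" than a
// cutoff as the subject.
class DepthMasker(modelBuffer: ByteBuffer) {
    private val interpreter = Interpreter(modelBuffer)
    private val size = 256  // assumed model input resolution

    fun estimateMask(source: Bitmap, depthCutoff: Float): Array<FloatArray> {
        val scaled = Bitmap.createScaledBitmap(source, size, size, true)

        // Pack the bitmap into a [1, 256, 256, 3] float buffer, RGB scaled to 0..1.
        val input = ByteBuffer.allocateDirect(4 * size * size * 3).order(ByteOrder.nativeOrder())
        for (y in 0 until size) {
            for (x in 0 until size) {
                val p = scaled.getPixel(x, y)
                input.putFloat(((p shr 16) and 0xFF) / 255f)  // R
                input.putFloat(((p shr 8) and 0xFF) / 255f)   // G
                input.putFloat((p and 0xFF) / 255f)           // B
            }
        }
        input.rewind()

        // Assumed output shape [1, 256, 256]; MiDaS-style models emit inverse
        // depth, so larger values mean closer to the camera.
        val depth = Array(1) { Array(size) { FloatArray(size) } }
        interpreter.run(input, depth)

        // Threshold the relative depth map into a binary foreground mask.
        return Array(size) { y ->
            FloatArray(size) { x -> if (depth[0][y][x] > depthCutoff) 1f else 0f }
        }
    }
}
```

Thresholding a relative depth map only separates the subject cleanly when she is much closer to the camera than everything else, which is why the result above is not usable in a fully automated way.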