firebase-mlkit, object-recognition, google-mlkit

How to recognize and count objects with Firebase / ML Kit


I'd like to recognize and count objects in a picture, e.g. count the number of houses in a picture of a neighbourhood. What's the best way to do this with ML Kit?

Do I need to use the Object Detection API? Or is it possible to get multiple "house" tags using a straightforward image labeler?


Solution

  • The ML Kit Object Detection API (now offered as a standalone SDK) can count objects in an image or video stream, but it is limited to the 5 largest objects. You should also evaluate whether object detection works for your use case. It is a very general localizer and works for most objects; however, when objects are close together or overlapping, it may not distinguish between them.

    If you need to detect more than 5 objects, I would recommend using TensorFlow Lite directly with one of the pre-trained models available on TF Hub, or training one yourself with AutoML Vision Edge if the general models don't fit your use case.

    Fwiw, Image Labeling assigns labels that describe the scene of an image. It does not count objects, though: you typically get a single "house" label.
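
To make the counting step concrete: whichever detector you end up with, it returns one entry per detected object, so counting comes down to tallying the entries that carry the label you care about with enough confidence. Below is a minimal Python sketch of that tallying only. It assumes the detector output has already been decoded into parallel lists of labels and scores. The function name `count_detections`, the sample data, and the 0.5 threshold are all made up for illustration and are not part of the ML Kit or TensorFlow Lite APIs.

```python
from collections import Counter


def count_detections(labels, scores, min_score=0.5):
    """Count detections per label, keeping only those at or above min_score."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return Counter(
        label for label, score in zip(labels, scores) if score >= min_score
    )


# Hypothetical detector output for one image: one entry per detected box.
labels = ["house", "house", "tree", "house", "car"]
scores = [0.91, 0.84, 0.77, 0.32, 0.65]

counts = count_detections(labels, scores, min_score=0.5)
print(counts["house"])  # prints 2: the third "house" scores 0.32 and is dropped
```

The threshold matters for crowded scenes: set it too low and you may count spurious boxes, set it too high and you may miss houses that are partly hidden, so it is worth tuning on a few of your own images.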