azure-cognitive-services google-vision

How many objects can these Computer Vision APIs detect?


https://learn.microsoft.com/fr-fr/azure/cognitive-services/computer-vision/concept-object-detection

https://cloud.google.com/vision/docs/object-localizer

I would like to know how many objects these APIs can recognize, and which ones, but I can't find this mentioned anywhere.

I found that the Google API uses https://developers.google.com/knowledge-graph/, which is based on schema.org types, but I don't really understand what it's all about.


Solution

  • I'm sorry, but as far as I know there is no fixed list of classes that Azure Computer Vision is able to detect.

    Besides, even if there were one, it would change on a regular basis (and no schedule is announced).
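
Since no list is published, one practical workaround is to tally the labels the service actually returns for your own images. The following is only a minimal sketch: it assumes responses shaped like the JSON in the object detection doc (an `objects` array whose items carry an `object` label and an optional nested `parent`). The helper name `collect_labels` and the sample data are hypothetical, not real API output.

```python
from collections import Counter


def collect_labels(responses):
    """Count every label, including parent categories, seen in a list of
    detection responses.

    Assumed response shape (not verified against a live service):
    {"objects": [{"object": "dog", "parent": {"object": "mammal", ...}}, ...]}
    """
    counts = Counter()
    for response in responses:
        for item in response.get("objects", []):
            node = item
            # Walk up the optional parent chain, e.g. dog -> mammal -> animal.
            while node is not None:
                counts[node["object"]] += 1
                node = node.get("parent")
    return counts


# Hypothetical sample responses, invented for illustration.
sample = [
    {"objects": [{"object": "dog",
                  "parent": {"object": "mammal",
                             "parent": {"object": "animal"}}}]},
    {"objects": [{"object": "cat",
                  "parent": {"object": "mammal",
                             "parent": {"object": "animal"}}}]},
]

labels = collect_labels(sample)
print(sorted(labels))
```

Run over a representative batch of your own images, this gives an empirical (and necessarily incomplete) picture of the labels you can expect.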

    In any case, there are limitations (see doc here):

    It's important to note the limitations of object detection so you can avoid or mitigate the effects of false negatives (missed objects) and limited detail.

    • Objects are generally not detected if they're small (less than 5% of the image).
    • Objects are generally not detected if they're arranged closely together (a stack of plates, for example).
    • Objects are not differentiated by brand or product names (different types of sodas on a store shelf, for example). However, you can get brand information from an image by using the Brand detection feature.
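
The 5% rule of thumb can be turned into a quick sanity check on the results you do get back. This is a rough sketch, assuming the documented response shape (a `rectangle` with `x`, `y`, `w`, `h` in pixels and a `metadata` block with `width` and `height`) and reading "5% of the image" as 5% of its area, which is my interpretation. `area_fractions` is a hypothetical helper and the sample values are invented.

```python
def area_fractions(response):
    """Return a dict mapping each detected label to the fraction of the
    image area covered by its bounding rectangle.

    If a label occurs more than once, the last occurrence wins; that is
    good enough for a quick sanity check.
    """
    meta = response["metadata"]
    image_area = meta["width"] * meta["height"]
    fractions = {}
    for item in response.get("objects", []):
        rect = item["rectangle"]
        fractions[item["object"]] = rect["w"] * rect["h"] / image_area
    return fractions


# Hypothetical response, invented for illustration (not real API output).
sample_response = {
    "metadata": {"width": 1000, "height": 500},
    "objects": [
        {"object": "plate", "confidence": 0.6,
         "rectangle": {"x": 10, "y": 10, "w": 400, "h": 300}},
        {"object": "fork", "confidence": 0.5,
         "rectangle": {"x": 450, "y": 50, "w": 100, "h": 260}},
    ],
}

fractions = area_fractions(sample_response)
for label, fraction in fractions.items():
    # Objects just above 5% are the ones most likely to be missed
    # in a slightly wider shot of the same scene.
    note = "near the 5% limit" if fraction < 0.10 else "comfortably large"
    print(f"{label}: {fraction:.1%} of the image ({note})")
```

If the objects you care about consistently land near or below that threshold, cropping or tiling the image before sending it is one possible mitigation.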

    If you want to detect specific objects, I would strongly suggest using Custom Vision (doc / overview here) rather than Computer Vision: with Custom Vision you can train a model on your own images to match what you are trying to detect.