google-cloud-platform video-intelligence-api

GCP Video Intelligence API Object Tracking


I've used the Video Intelligence API to do object tracking on video.

According to the documentation [1], it recognizes more than 20,000 objects, places, and actions in stored and streaming video.

I have a question: is there any documentation that lists which kinds of objects can or can't be recognized?

It's my first question. Thank you.

[1] https://cloud.google.com/video-intelligence


Solution

  • This GCP documentation lists the categories of things the Cloud Video Intelligence API can detect, analyze, track, transcribe, and recognize: https://cloud.google.com/video-intelligence/docs/how-to

    According to the GCP documentation, the Cloud Video Intelligence API can detect, track, and recognize faces, people, shot changes, explicit content, objects, logos, and text. The API's models are pre-trained. If there are objects it can't recognize, you can train your own custom models using AutoML Video Intelligence. To get started with AutoML Video Intelligence, see: https://cloud.google.com/video-intelligence/automl/docs/beginners-guide

    As for which objects can be recognized, there is no document that states which objects are not recognizable. The only limits given in the Cloud Video Intelligence API documentation concern video size, requests, and video length: https://cloud.google.com/video-intelligence/quotas
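
Since no list of recognizable objects is published, one practical way to find out what the pre-trained model detects is to run object tracking on a representative video and look at the labels it returns. The following is a minimal sketch, assuming the `google-cloud-videointelligence` Python client library is installed and application credentials are configured. The function names `track_objects` and `summarize_tracked_objects` are illustrative helpers, not part of the API.

```python
def summarize_tracked_objects(annotation_result):
    """Map each detected entity description to its highest confidence."""
    best = {}
    for obj in annotation_result.object_annotations:
        label = obj.entity.description
        best[label] = max(best.get(label, 0.0), obj.confidence)
    return best


def track_objects(gcs_uri, timeout=600):
    """Run OBJECT_TRACKING on a video stored in Cloud Storage.

    gcs_uri is a gs:// path to your own video (an assumption of this sketch).
    """
    # Imported lazily so the summary helper works without the client library.
    from google.cloud import videointelligence

    client = videointelligence.VideoIntelligenceServiceClient()
    operation = client.annotate_video(
        request={
            "features": [videointelligence.Feature.OBJECT_TRACKING],
            "input_uri": gcs_uri,
        }
    )
    # annotate_video is a long-running operation; block until it finishes.
    result = operation.result(timeout=timeout)
    # A single input video yields a single annotation result.
    return summarize_tracked_objects(result.annotation_results[0])
```

Calling `track_objects("gs://your-bucket/your-video.mp4")` (a hypothetical path) would return a dictionary of the labels found in that video with their best confidence scores. Labels you expected but that never appear are candidates for a custom model.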