[SOLVED] Google Cloud Video Intelligence API on S3 objects

What is the best way to process/analyze S3 objects using the Google Cloud Video Intelligence API? My current plan is to copy the S3 object to Google Cloud Storage (GCS) and then call the API. To copy from S3 to GCS, it looks like the Google Cloud Transfer Service API is the only option.

My desired flow is:

1. User uploads to S3.
2. My backend copies from S3 to GCS using Google Cloud Transfer Service API.
3. Run Google Cloud Video Intelligence API on the copied object.
4. Retrieve the results.
5. Delete the copied object.

Is there a better alternative that avoids the copying? If not, is the Transfer Service API the correct choice for copying individual objects?

Thanks.

Solution

If you must store data in S3 as the authoritative source, then I think that your current plan is probably the best one. If you can use GCS as your home for data, that'd obviously make things easier for this one particular task.

Google Cloud's APIs want to have easy and fast access to the data (rather than trying to pull it down from some remote service such as S3). This means that the only reasonable place to keep that data (from Google's perspective) is in GCS.
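To make that concrete, here is a rough sketch of what the annotate request for a copied object could look like. It only builds the JSON body for the REST `videos:annotate` method, which takes the video location as a `gs://` URI in `inputUri`. The helper name, the bucket and object names, and the choice of `LABEL_DETECTION` are placeholders I made up for illustration; sending the request and authenticating are left out.

```python
def build_annotate_request(bucket, object_name, features=("LABEL_DETECTION",)):
    """Build the JSON body for a videos:annotate call on a GCS object.

    Hypothetical helper: bucket/object names and the default feature
    are assumptions, not something from the question.
    """
    if not bucket or not object_name:
        raise ValueError("bucket and object_name are required")
    return {
        # The API reads the video from GCS, addressed as gs://bucket/object.
        "inputUri": f"gs://{bucket}/{object_name}",
        # One or more analysis features to run on the video.
        "features": list(features),
    }


# Example: request label detection on the object copied over from S3.
request_body = build_annotate_request("my-gcs-staging-bucket", "uploads/clip.mp4")
```

The body would then be POSTed to the `videos:annotate` endpoint (or the equivalent call made through a client library), and the results read from the returned long-running operation.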

Google Cloud's Storage Transfer Service is definitely the right option. It lets you schedule recurring transfers, if that makes sense for your use case, or trigger one-off transfers on demand. With S3 as your data source, you can also apply filters to include or exclude objects (e.g., by directory prefix) and restrict transfers based on their last modification time (as reported by S3).

For example, a transfer job can be limited to objects under a given prefix, and to objects that were changed in the last 24 hours.
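The prefix and last-24-hours filters could be expressed in a transfer job body roughly like this. It is a sketch that only assembles the JSON for the Storage Transfer Service `transferJobs.create` call, using its `objectConditions` fields (`includePrefixes`, `maxTimeElapsedSinceLastModification`). The function name, project ID, bucket names, and credentials are all placeholders I invented, and actually submitting the job is left out.

```python
import datetime


def build_s3_to_gcs_transfer_job(project_id, s3_bucket, gcs_bucket,
                                 aws_access_key_id, aws_secret_access_key,
                                 include_prefixes, max_age_seconds=86400,
                                 run_date=None):
    """Build a transferJobs.create body for a one-off S3 -> GCS copy.

    Hypothetical helper: every argument value is supplied by the caller;
    nothing here talks to AWS or Google Cloud.
    """
    if run_date is None:
        # Assumption: schedule the job for today's date in UTC.
        run_date = datetime.datetime.now(datetime.timezone.utc).date()
    day = {"year": run_date.year, "month": run_date.month, "day": run_date.day}
    return {
        "description": "One-off S3 to GCS copy",
        "status": "ENABLED",
        "projectId": project_id,
        # Same start and end date: the job is scheduled for a single run.
        "schedule": {"scheduleStartDate": dict(day), "scheduleEndDate": dict(day)},
        "transferSpec": {
            "awsS3DataSource": {
                "bucketName": s3_bucket,
                "awsAccessKey": {
                    "accessKeyId": aws_access_key_id,
                    "secretAccessKey": aws_secret_access_key,
                },
            },
            "gcsDataSink": {"bucketName": gcs_bucket},
            "objectConditions": {
                # Only objects whose names start with one of these prefixes.
                "includePrefixes": list(include_prefixes),
                # Only objects modified within the last max_age_seconds.
                "maxTimeElapsedSinceLastModification": f"{max_age_seconds}s",
            },
        },
    }
```

For copying a single uploaded object, passing its full key as the only entry in `include_prefixes` should narrow the job to that object (plus any other keys that happen to share the same prefix).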