djangogoogle-cloud-platformspeech-recognitionvideo-intelligence-api

Google Cloud VideoIntelligence Speech Transcription - Transcription Size


I use Google Cloud Speech Transcription as following :

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.enums.Feature.SPEECH_TRANSCRIPTION]
operation = video_client.annotate_video(gs_video_path, features=features)
result = operation.result(timeout=3600)

And I present the transcript and store the transcript in Django Objects using PostgreSQL as following :

transcriptions = response.annotation_results[0].speech_transcriptions
for transcription in transcriptions:
    best_alternative = transcription.alternatives[0]
    confidence = best_alternative.confidence
    transcript = best_alternative.transcript
    if SpeechTranscript.objects.filter(text = transcript).count() == 0:
        SpeechTranscript.objects.create(text = transcript,
                                        confidence = confidence)
        print(f"Adding -> {confidence:4.10%} | {transcript.strip()}")
    else:
        pass

For instance the following is the text that I receive from a sample video :

 94.9425220490% | I refuse to where is it short sleeve dress shirt. I'm just not going there the president of the United States is a visit to Walter Reed hospital in mid-july format was the combination of weeks of cajoling by trump staff and allies to get the presents for both public health and political perspective wearing a mask to protect against the spread of covid-19 reported in advance of that watery trip and I quote one presidential aide to the president to set an example for a supporters by wearing a mask and the visit.
 94.3865835667% | Mask wearing is because well science our best way to slow the spread of the coronavirus. Yes trump or Matthew or 3 but if you know what he said while doing sell it still anybody's guess about what can you really think about NASCAR here is what probably have a mass give you probably have a hospital especially and that particular setting were you talking to a lot of people I think it's but I do believe it. Have a a time and a place very special line trump saying I've never been against masks but I do believe they have a time and a place that isn't exactly a ringing endorsement for mask wearing.
 94.8513686657% | Republican skip this isn't it up to four men over the perfumer's that wine about time and place should be a blinking red warning light for people who think debate over whether last for you for next coronavirus. They are is finally behind us time in a place lined everything you need to know about weird Trump is like headed next time he'll get watery because it was a hospital and will continue to express not so scepticism to wear masks in public house new CDC guidelines recommending that mask to be worn inside and one social this thing is it possible outside he sent this?
 92.9862976074% | He wearing a face mask as agreed presidents prime minister's dictators Kings Queens and somehow. I don't see it for myself literally main door he responded this way back backstage, but they said you didn't need it trump went to Michigan to this later and he appeared in which personality approaching Mark former vice president Joe Biden
 94.6677267551% | In his microwave fighting for wearing a mask and he walked onto the stage where it is massive mask there's nobody understands and there's any takes it off you like to have it hanging off you. I think it makes them feel good frankly if you want to know the truth who's got the largest basket together. Seen it because trump thinks that maths make him and people generally I guess what a week or something is resistant wearing one in public from 1 today which has had a correlation between the erosion of the public's confidence and trump have the corner coronavirus and his number is SE6 a second term in the 67.
 94.9921131134% | The coronavirus pandemic in the heels of national and swings they both lots of them that show trump slipping further and further behind former vice president Joe Biden when it comes to General Election good policy would seem to make for good politics at all virtually every infectious disease expert believes that wearing masks in public is our best to contain the spread of coronavirus until a vaccine would do well to listen to buy on this one a mare is the point we make episode every Tuesday and Thursday make sure to check them all out.

What is the predicted size of a transcript that is generated within the speech transcription results. What decides the size of each transcript ? What is the max and minimum character length ? How should I design my SQL table column size, in order to be prepared for the expected transcript size ?


Solution

  • As I mentioned in the comments, the Video Intelligence transcripts are splits with roughly 50-60 seconds from the video.

    I have created a Public Issue Tracker case, link, so the product team can clarify this information within the documentation. Although, I do not have an eta for this request, I encourage you to follow the case's thread.