azure, azure-cognitive-services

VTT output for Azure Transcription JSON file


I searched and found this page: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/captioning-concepts?pivots=programming-language-javascript

In the "Caption output format" section, it says:

The Speech service supports output formats such as SRT (SubRip Text) and WebVTT (Web Video Text Tracks).

But there is no option to set the output format in the API reference: https://eastus.dev.cognitive.microsoft.com/docs/services/speech-to-text-api-v3-0/operations/CreateTranscription

I am using the Create Transcription API to send video/audio files longer than 30 minutes, and Azure returns the transcription result as JSON, like the following:

https://spsvcprodeus.blob.core.windows.net/bestor-c6e3ae79-1b48-41bf-92ff-940bea3e5c2d/TranscriptionData/1a7f53a1-b254-4edc-a03a-20aa926423b7_0_0.json?sv=2021-08-06&st=2022-11-09T19%3A05%3A26Z&se=2022-11-10T07%3A10%3A26Z&sr=b&sp=rl&sig=4g80znxLM%2FVhCJI7iJLNETGd%2B%2B442eubSOQikjQpvZU%3D

I'm planning to write a script to convert the transcription JSON to VTT, but it would be really helpful if that already exists or is something I can request as an output format.


Solution

  • You need a Speech resource key for this to work. Create a Speech service resource in the Azure portal, then get the sample Python code that converts speech to text.

    Get the Python captioning sample code for speech to text.

    Set the key as an environment variable (Windows):

    setx SPEECH_KEY your-key
    

    Create captions from the speech

    Change to the directory that contains the sample code, then install the Speech SDK:

    pip install azure-cognitiveservices-speech
    

    Run the application:

    python captioning.py --input caption.this.mp4 --format any --output caption.output.txt --srt --realTime --threshold 5 --delay 0 --profanity mask --phrases "Contoso;Jessie;Rehaan"

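    For details of the SRT format -> Link. The --srt flag in the command above selects SRT output; without it, the captioning sample writes WebVTT.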

    Azure services have duration limits, so check the quota and support details at the link.
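If you stay with the Create Transcription API, the conversion script mentioned in the question can be fairly small. The following is only a minimal sketch, not an official converter. It assumes the batch transcription JSON contains a `recognizedPhrases` array whose entries carry `recognitionStatus`, `offsetInTicks`, `durationInTicks` (1 tick = 100 ns), and `nBest[0].display`. The function names `ticks_to_timestamp` and `transcription_to_vtt` are hypothetical, and each recognized phrase becomes one cue without any line splitting.

```python
import json

TICKS_PER_MS = 10_000  # assumption: one tick is 100 nanoseconds


def ticks_to_timestamp(ticks):
    """Format a tick count as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(ticks) // TICKS_PER_MS
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def transcription_to_vtt(transcription):
    """Build WebVTT text from a parsed transcription dict (assumed field names)."""
    lines = ["WEBVTT", ""]
    phrases = sorted(
        transcription.get("recognizedPhrases", []),
        key=lambda p: p["offsetInTicks"],
    )
    for phrase in phrases:
        # Skip phrases that failed recognition or have no candidates.
        if phrase.get("recognitionStatus") != "Success" or not phrase.get("nBest"):
            continue
        start = phrase["offsetInTicks"]
        end = start + phrase["durationInTicks"]
        lines.append(f"{ticks_to_timestamp(start)} --> {ticks_to_timestamp(end)}")
        lines.append(phrase["nBest"][0]["display"])  # top candidate, display form
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

To use it, load the downloaded result with `json.load`, pass the resulting dict to `transcription_to_vtt`, and write the returned string to a `.vtt` file. Long phrases may need to be split into shorter cues for readable captions, which this sketch does not attempt.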