next.js  speech-recognition  speech-to-text  openai-whisper

How to automatically generate subtitles for a video and translate them in Next.js


I want to do the following:

  1. The user logs in to my website.
  2. They click "upload video" and choose a video from their drive.
  3. The video is uploaded to the cloud.
  4. Once the upload finishes, the speech in the video (whatever language it is in) should be converted to text and then translated to English.
  5. A file with the captions should be saved so that the user can pick them from the "Subtitles" menu in the video player.

The tricky part is step 4, because I don't know which software I could use to generate subtitles from a video and translate them to English (and possibly other languages).

I found Whisper from OpenAI and I'm wondering whether I should use it or something else. Also, it's a Python library, and I'm wondering how I can actually call it once the upload has completed. I'm using Next.js.


Solution

  • OpenAI's official Node.js package exposes the hosted Whisper model, so you don't need to call Python from your server code: https://www.npmjs.com/package/openai

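    import fs from "fs";
    import OpenAI from "openai";

    // The client reads the API key from the OPENAI_API_KEY environment variable.
    const openai = new OpenAI();

    async function main() {
      // Supported formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
      const transcription = await openai.audio.transcriptions.create({
        file: fs.createReadStream("audio.mp3"),
        model: "whisper-1",
      });

      // The transcript comes back in the language spoken in the audio.
      console.log(transcription.text);
    }

    main().catch(console.error);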
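
  • For steps 4 and 5 of the question (translate to English, save a captions file), the same package also has a translations endpoint, and both endpoints accept a `response_format` such as "vtt". Below is a minimal sketch, not a definitive implementation. The helper name `saveEnglishSubtitles` and the file paths are made up for illustration, and `client` is assumed to be an `OpenAI` instance created as in the example above. It also assumes the file is within the API's upload size limit (25 MB at the time of writing), so a long video would probably need its audio extracted or split first.

```javascript
import fs from "node:fs";

// Hypothetical helper (not part of the openai package): asks the Whisper API to
// translate the speech in `mediaPath` into English and saves it as WebVTT captions.
// `client` is assumed to be an instance of the OpenAI class from the "openai" package.
async function saveEnglishSubtitles(client, mediaPath, vttPath) {
  const vtt = await client.audio.translations.create({
    file: fs.createReadStream(mediaPath),
    model: "whisper-1",
    response_format: "vtt", // request WebVTT text instead of the default JSON
  });

  // Assumption: with a text format such as "vtt" the SDK returns a plain string.
  if (typeof vtt !== "string") {
    throw new TypeError("Expected WebVTT text from the translations endpoint");
  }

  await fs.promises.writeFile(vttPath, vtt, "utf8");
  return vttPath;
}
```

    In a Next.js app this could be called from server-side code (for example a route handler) once the upload has finished, e.g. `await saveEnglishSubtitles(openai, "video.mp4", "video.en.vtt")`. The resulting .vtt file can then be offered in the player through a `<track kind="subtitles" srclang="en">` element. Note that the translations endpoint only targets English; subtitles in other languages would need a separate translation step.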