node.js google-cloud-platform google-cloud-speech

Google Cloud Speech - returns empty transcription, despite accurate payload


I have a Node.js project whose index.js sets up a /recognize endpoint that calls the Speech-to-Text recognize API:

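// Assumed to sit at the top of index.js (CommonJS):
const { SpeechClient } = require('@google-cloud/speech');

app.post('/recognize', async (req, res) => {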
  try {
    const audio = req.body.audio;

    // Use Google Cloud Speech-to-Text API to convert audio to text
    const speechClient = new SpeechClient();

    const request = {
      audio: {
        content: audio,
      },
      config: {
        encoding: 'FLAC',
        sampleRateHertz: 48000,
        languageCode: 'en-US',
      },
    };

    const [response] = await speechClient.recognize(request);
    console.log(response.results)
    const transcription = response.results
      .map((result) => result.alternatives[0].transcript)
      .join('\n');

    res.json({ transcription: transcription });
  } catch (error) {
    console.error(error);
    // An Error object serializes to {} in JSON, so return its message instead
    res.status(500).json({ error: error.message });
  }
});

On the frontend, I send the audio to that endpoint like this:

  sendAudioToBackend(audioBase64: string) {
    this.http.post('http://localhost:3000/recognize', { audio: audioBase64 }).subscribe((response: any) => {
      const transcription = response.transcription;

      // Process the transcribed text and send it to the backend
      this.sendMessage(transcription);
    });
  }

The request itself goes through: the API receives a payload with a valid base64 audio string. However, the response contains an empty transcription: {"transcription":""}.

I tried adding Google Cloud Error Reporting, but it didn't show anything specific, and I couldn't find where the issue lies.

Does anyone have an idea how to get a successful response back?

Thank you


Solution

  • The answer linked below solved it for me. I changed the encoding in the recognition config from FLAC to WEBM_OPUS, and now it works.

    Google Cloud Speech-to-Text returns empty transcription for OGG OPUS Base64 audio
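
The fix above suggests that the `encoding` in the config has to match the container the audio was actually recorded in. A rough sketch of how one might check that on the backend before calling `recognize` is shown here. The helper names `detectEncoding` and `buildRecognizeRequest` are hypothetical (they are not part of `@google-cloud/speech`), the signature table only covers three containers, and it assumes that Ogg and WebM payloads carry Opus audio. The 48000 Hz rate is simply carried over from the question's config.

```javascript
// Sketch only: guess the Speech-to-Text `encoding` from the audio's leading bytes.
// Helper names are hypothetical, not part of the @google-cloud/speech library.

// Well-known container signatures (magic bytes).
const SIGNATURES = [
  { encoding: 'FLAC', bytes: [0x66, 0x4c, 0x61, 0x43] },      // "fLaC"
  { encoding: 'OGG_OPUS', bytes: [0x4f, 0x67, 0x67, 0x53] },  // "OggS" (assumes Opus inside Ogg)
  { encoding: 'WEBM_OPUS', bytes: [0x1a, 0x45, 0xdf, 0xa3] }, // EBML header (assumes Opus inside WebM)
];

// Remove a "data:...;base64," prefix if the frontend sent a data URL.
function stripDataUrlPrefix(audioBase64) {
  return audioBase64.replace(/^data:[^,]*,/, '');
}

// Return the matching encoding name, or null if the header is not recognized.
function detectEncoding(audioBase64) {
  const payload = stripDataUrlPrefix(audioBase64);
  // The first 8 base64 characters decode to at most 6 bytes, enough for the signatures.
  const header = Buffer.from(payload.slice(0, 8), 'base64');
  const match = SIGNATURES.find(({ bytes }) =>
    bytes.every((byte, i) => header[i] === byte)
  );
  return match ? match.encoding : null;
}

// Build the request object in the same shape the question passes to recognize().
function buildRecognizeRequest(audioBase64, languageCode = 'en-US') {
  const encoding = detectEncoding(audioBase64);
  if (!encoding) {
    throw new Error('Unrecognized audio container; set the encoding explicitly');
  }
  const config = { encoding, languageCode };
  if (encoding.endsWith('_OPUS')) {
    // Assumption: the recording is 48 kHz, as in the question's config.
    config.sampleRateHertz = 48000;
  }
  return { audio: { content: stripDataUrlPrefix(audioBase64) }, config };
}
```

In the route handler, `speechClient.recognize(buildRecognizeRequest(req.body.audio))` would then replace the hard-coded request. Logging the detected encoding is also a quick way to confirm what the browser is really sending.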