I have a Node.js project with an index.js, and within it I have set up a recognize API call:
const { SpeechClient } = require('@google-cloud/speech');

app.post('/recognize', async (req, res) => {
  try {
    // Base64-encoded audio sent by the frontend
    const audio = req.body.audio;
    // Use Google Cloud Speech-to-Text API to convert audio to text
    const speechClient = new SpeechClient();
    const request = {
      audio: {
        content: audio,
      },
      config: {
        encoding: 'FLAC',
        sampleRateHertz: 48000,
        languageCode: 'en-US',
      },
    };
    const [response] = await speechClient.recognize(request);
    console.log(response.results);
    // Join the top alternative of each result into one transcript
    const transcription = response.results
      .map((result) => result.alternatives[0].transcript)
      .join('\n');
    res.json({ transcription: transcription });
  } catch (error) {
    console.error(error);
    // Send the message: an Error object itself serializes to {} in JSON
    res.status(500).json({ error: error.message });
  }
});
On the frontend, I send the audio file like this:
sendAudioToBackend(audioBase64: string) {
  this.http.post('http://localhost:3000/recognize', { audio: audioBase64 }).subscribe((response: any) => {
    const transcription = response.transcription;
    // Process the transcribed text and send it to the backend
    this.sendMessage(transcription);
  });
}
This works in the sense that the API receives a payload with a valid base64 audio string. However, the response is an empty {"transcription":""}.
I tried adding Google Cloud Error Reporting, but it did not report anything specific, and I couldn't find where the issue lies.
Does anyone have an idea how to get a successful response back?
Thank you
This answer solved it for me. I changed my encoding to WEBM_OPUS and now it works.
Google Cloud Speech-to-Text returns empty transcription for OGG OPUS Base64 audio
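For anyone hitting the same empty result, here is a rough sketch of how the mismatch could be caught before calling the API. It looks at the first bytes of the decoded audio and picks the encoding value from the container signature. The helper names (stripDataUrlPrefix, guessEncoding, buildRecognizeRequest) are made up for illustration and are not part of the Google client library. It also assumes that a WebM or Ogg container holds Opus audio, which is typical for browser recordings but not guaranteed, and it keeps the 48000 sample rate from the question.

```javascript
// Magic-byte signatures of the containers discussed in this thread.
// Assumption: WebM and Ogg are taken to contain Opus audio.
const SIGNATURES = [
  { encoding: 'WEBM_OPUS', bytes: [0x1a, 0x45, 0xdf, 0xa3] }, // EBML header (WebM)
  { encoding: 'OGG_OPUS', bytes: [0x4f, 0x67, 0x67, 0x53] }, // "OggS"
  { encoding: 'FLAC', bytes: [0x66, 0x4c, 0x61, 0x43] }, // "fLaC"
];

// Hypothetical helper: drop a "data:audio/...;base64," prefix if present.
function stripDataUrlPrefix(audioBase64) {
  return audioBase64.replace(/^data:[^,]*,/, '');
}

// Hypothetical helper: return a Speech-to-Text encoding name, or null
// when the first four bytes match none of the known signatures.
function guessEncoding(audioBase64) {
  // 8 base64 characters decode to 6 bytes, enough for a 4-byte signature.
  const header = Buffer.from(audioBase64.slice(0, 8), 'base64');
  const match = SIGNATURES.find(({ bytes }) =>
    bytes.every((b, i) => header[i] === b)
  );
  return match ? match.encoding : null;
}

// Hypothetical helper: build the request object passed to recognize().
function buildRecognizeRequest(audioBase64) {
  const content = stripDataUrlPrefix(audioBase64);
  const encoding = guessEncoding(content);
  if (!encoding) {
    throw new Error('Unrecognized audio container');
  }
  return {
    audio: { content },
    config: {
      encoding,
      sampleRateHertz: 48000, // value taken from the question
      languageCode: 'en-US',
    },
  };
}
```

In the route above, the request could then be built with buildRecognizeRequest(req.body.audio) instead of hard-coding FLAC. Audio recorded in the browser with MediaRecorder is commonly WebM, which would explain why FLAC produced no results here.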