javascript electron speech-recognition speech-to-text

What are the ways to implement speech recognition in Electron?


I have an Electron app that uses the Web Speech API (SpeechRecognition) to capture the user's voice, but it's not working. The code:

if ("webkitSpeechRecognition" in window) {
  let SpeechRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;
  let recognition = new SpeechRecognition();

  recognition.onstart = () => {
    console.log("We are listening. Try speaking into the microphone.");
  };

  recognition.onspeechend = () => {
    recognition.stop();
  };

  recognition.onresult = (event) => {
    let transcript = event.results[0][0].transcript;
    console.log(transcript);
  };

  recognition.start();
} else {
  alert("Browser not supported.");
}

It logs We are listening... to the console, but no matter what I say, it produces no output. Running the exact same code in Google Chrome works, and whatever I say gets logged by the console.log(transcript); line. I did some more research, and it turns out that Google has recently stopped supporting the Web Speech API in shell-based Chromium windows (to my knowledge, everything that is not Google Chrome or MS Edge), so that seems to be why it is not working in my Electron app.

See: electron-speech library's end, Artyom.js issue, another Stack Overflow question regarding this.
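The snippet above registers no error handler, so a failed recognition attempt ends silently. A minimal sketch for surfacing the failure is shown here; `attachDiagnostics` is a hypothetical helper name, and the assumption (not verified in this post) is that Electron reports an error such as `network` through the standard `error` event.

```javascript
// Hypothetical helper: logs the events that explain why no result arrived.
// `recognition` is a SpeechRecognition instance; `log` defaults to console.log.
function attachDiagnostics(recognition, log = console.log) {
  // Fired when recognition fails; event.error holds a short code string.
  recognition.onerror = (event) => log(`error: ${event.error}`);
  // Fired when speech was heard but nothing could be matched.
  recognition.onnomatch = () => log("nomatch");
  // Fired when the service disconnects, with or without a result.
  recognition.onend = () => log("end");
  return recognition;
}
```

Calling `attachDiagnostics(recognition)` before `recognition.start()` should make it visible whether the session ends with an error or simply with no match.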

So is there any way I can get it to work in Electron?


Solution

  • I ended up with an implementation that captures the user's speech from the microphone through the media devices API and sends it over WebSockets to a Python server. The server feeds the audio stream to the SpeechRecognition pip package and returns the transcribed text to the client (the Electron app).

    This is far more machinery than such a simple task should need, so if someone has a better suggestion, please let me know by writing an answer.
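The client half of that approach can be sketched roughly as follows. This is not the code from the answer, only an illustration under assumptions: the function names, the server URL, the 16 kHz mono 16-bit PCM format, and the idea that the server replies with plain-text transcripts are all hypothetical. `ScriptProcessorNode` is deprecated in favor of `AudioWorklet`; it is used here only to keep the sketch short.

```javascript
// Convert Web Audio float samples (range -1..1) to 16-bit signed PCM.
function floatTo16BitPCM(samples) {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range values
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Stream microphone audio to a WebSocket server (hypothetical endpoint) and
// pass each text message it sends back to onTranscript.
// Resolves to a function that stops the capture.
async function streamMicrophone(url, onTranscript) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const socket = new WebSocket(url);
  socket.onmessage = (event) => onTranscript(event.data);

  // Assumed format: 16 kHz mono, which the server must expect as well.
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const source = audioContext.createMediaStreamSource(stream);
  const processor = audioContext.createScriptProcessor(4096, 1, 1);

  processor.onaudioprocess = (event) => {
    if (socket.readyState !== WebSocket.OPEN) return; // drop audio until connected
    const samples = event.inputBuffer.getChannelData(0);
    socket.send(floatTo16BitPCM(samples).buffer);
  };

  source.connect(processor);
  processor.connect(audioContext.destination);

  return () => {
    processor.disconnect();
    source.disconnect();
    stream.getTracks().forEach((track) => track.stop());
    socket.close();
    audioContext.close();
  };
}
```

In the renderer process this might be started with something like `const stop = await streamMicrophone("ws://localhost:8765", console.log);`, where the address is a placeholder for wherever the Python server listens. The server side would need to wrap the incoming PCM bytes in whatever audio object its recognition library expects.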