I'm building a React component that streams audio and transcribes it in real time over a WebSocket connection. I have a state variable currentTranscription that should accumulate the transcriptions as they come in. However, when data.speech_final is true and I call setTranscriptions, currentTranscription is empty, even though I can see its content in the UI.
I guess it's related to the asynchronous behaviour of setState, but I can't figure out what the issue is.
Here’s a simplified version of my code:
import { useState, useEffect, useRef } from 'react';

function Interview() {
  const [currentTranscription, setCurrentTranscription] = useState('');
  const [transcriptions, setTranscriptions] = useState<string[]>([]);
  const socket = useRef<WebSocket | null>(null);

  useEffect(() => {
    socket.current = new WebSocket('ws://localhost:3000');
    socket.current.addEventListener('message', (event) => {
      const data = JSON.parse(event.data);
      if (data.channel && data.channel.alternatives) {
        const transcript = data.channel.alternatives[0].transcript;
        if (transcript && data.is_final) {
          setCurrentTranscription((prevTranscription) => prevTranscription + ' ' + transcript);
        }
        if (data.speech_final) {
          // currentTranscription is empty
          setTranscriptions((prevTranscriptions) => [...prevTranscriptions, currentTranscription]);
          setCurrentTranscription('');
        }
      }
    });
    return () => {
      socket.current?.close();
    };
  }, []);

  return <div>{currentTranscription}</div>;
}
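The issue you are having is due to how JavaScript closures work with event listeners and state. When the message event listener is created, it captures a snapshot of the value of currentTranscription from that render. Because the effect has an empty dependency array, the listener is created only once, so that snapshot never updates. When the data.speech_final case is hit, the state has been updated, but the event listener still sees the stale snapshot (the initial empty string). The functional update in setCurrentTranscription works because React passes it the latest value instead of reading it from the closure.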
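To make the stale snapshot concrete, here is a minimal sketch in plain TypeScript with no React involved. The state variable and the renderAndCreateListener function are hypothetical stand-ins for React's stored state and a component render; they only illustrate the closure behaviour described above.

```typescript
// Simulated state, standing in for what React keeps between renders.
let state = '';

// Hypothetical "render": copies the current state into a const, much like
// `const [currentTranscription] = useState('')` does on every render,
// and returns a listener that closes over that const.
function renderAndCreateListener(): () => string {
  const currentTranscription = state;
  return () => currentTranscription;
}

// Registered once, as with useEffect(..., []).
const listener = renderAndCreateListener();

// State changes and the component "re-renders", but the new closure
// is never registered anywhere.
state = 'hello world';
const listenerFromLatestRender = renderAndCreateListener();

// The original listener still sees the value from the first render.
const staleValue = listener();
// A listener created during the latest render sees the new value.
const freshValue = listenerFromLatestRender();
```

The listener registered during the first render keeps returning the empty string, which is the situation the message handler in the question is in.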
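One way to handle this is to have the listener only signal that the utterance has ended, in this case by incrementing an index, and to let a separate useEffect that depends on index move the finished transcription into the list and reset it. That effect runs after a render, so it reads the current value of currentTranscription.
Like so: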
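import { useState, useEffect, useRef } from 'react';

function Interview() {
  const [currentTranscription, setCurrentTranscription] = useState('');
  const [transcriptions, setTranscriptions] = useState<string[]>([]);
  // Counts finished utterances; each increment triggers the effect below.
  const [index, setIndex] = useState<number>(0);
  const socket = useRef<WebSocket | null>(null);

  useEffect(() => {
    socket.current = new WebSocket('ws://localhost:3000');
    socket.current.addEventListener('message', (event) => {
      const data = JSON.parse(event.data);
      if (data.channel && data.channel.alternatives) {
        const transcript = data.channel.alternatives[0].transcript;
        if (transcript && data.is_final) {
          // Functional update: does not depend on the closed-over state.
          setCurrentTranscription((prevTranscription) => prevTranscription + ' ' + transcript);
        }
        if (data.speech_final) {
          // Only signal the end of the utterance; do not read state here.
          setIndex((prev) => prev + 1);
        }
      }
    });
    return () => {
      socket.current?.close();
    };
  }, []);

  useEffect(() => {
    // Skip the initial render, before any utterance has finished.
    if (index === 0) return;
    // This effect runs after a render, so currentTranscription is up to date.
    setTranscriptions((prevTranscriptions) => [...prevTranscriptions, currentTranscription]);
    setCurrentTranscription('');
    // currentTranscription is deliberately left out of the dependencies so the
    // effect only fires when index changes (the exhaustive-deps lint rule will warn).
  }, [index]);

  return <div>{currentTranscription}</div>;
}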
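As a possible alternative, not part of the approach above, both pieces of state could live in a single reducer so that finalizing never reads from a closure at all. The sketch below assumes hypothetical names (TranscriptionState, TranscriptionAction, transcriptionReducer) and keeps the same string concatenation as the original code.

```typescript
// Hypothetical state shape: the in-progress text plus the finished utterances.
type TranscriptionState = { current: string; finished: string[] };

// Hypothetical actions the message listener would dispatch.
type TranscriptionAction =
  | { type: 'append'; transcript: string }
  | { type: 'finalize' };

const initialState: TranscriptionState = { current: '', finished: [] };

// Pure reducer: always receives the latest state as an argument,
// so there is no stale snapshot to worry about.
function transcriptionReducer(
  state: TranscriptionState,
  action: TranscriptionAction
): TranscriptionState {
  if (action.type === 'append') {
    return { ...state, current: state.current + ' ' + action.transcript };
  }
  // 'finalize': move the current text into the list and reset it.
  return { current: '', finished: [...state.finished, state.current] };
}

// Simulate two final transcripts followed by speech_final.
const actions: TranscriptionAction[] = [
  { type: 'append', transcript: 'hello' },
  { type: 'append', transcript: 'world' },
  { type: 'finalize' },
];
const afterUtterance = actions.reduce(transcriptionReducer, initialState);
```

In the component this would be wired up with useReducer(transcriptionReducer, initialState), with the listener dispatching { type: 'append', transcript } when data.is_final is true and { type: 'finalize' } when data.speech_final is true. The dispatch function is stable across renders, so the empty dependency array stays correct.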