I'm trying to create a Node.js WebSocket server that receives audio from Twilio as a base64-encoded string. The decoded audio is then written to the speaker using the write() method. Twilio says it sends the data as base64-encoded "audio/x-mulaw".
However, when I run the code, the speaker outputs bursts of static instead of the expected audio. The bursts of static do line up with my speaking into the mic, but the sound is not at all recognizable, and I'm not sure what's causing it.
Here's my code:
import { WebSocketServer } from 'ws';
import Speaker from 'speaker';
import alawmulaw from 'alawmulaw';
// Create a new Speaker instance with the specified format
const speaker = new Speaker();
const wss = new WebSocketServer({ port: 5000 });
wss.on('connection', function connection(ws) {
  ws.on('message', function message(data) {
    let obj = JSON.parse(data);
    if (obj.event === "media") {
      let buff = Buffer.from(obj.media.payload, 'base64');
      let PCM = Buffer.from(alawmulaw.mulaw.decode(buff));
      speaker.write(PCM);
    }
  });
});
I'm relatively confident that it is an issue with the encoding but I have tried various configurations and nothing has worked so far. I would greatly appreciate it if anyone could share some ideas on how to work through this. Thanks!
Configure your speaker to match the decoded audio (16-bit signed mono PCM at 8000 Hz), like this:
const speaker = new Speaker({
  float: false, // integer PCM samples, not floating point
  signed: true, // samples are signed
  bitDepth: 16, // 16 bits per sample, matching the decoded audio
  channels: 1, // mono audio
  sampleRate: 8000, // match this to your audio data's sample rate
  samplesPerFrame: 1, // samples sent to the audio backend at a time (the library's documented default is 1024)
});
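Beyond the speaker format, the Buffer conversion in the question is worth a second look. Assuming `alawmulaw.mulaw.decode` returns an `Int16Array` (as its README describes), `Buffer.from(int16Array)` copies each element as one truncated byte rather than copying the underlying 16-bit samples, which would likely sound like static. The sketch below swaps in a hand-written G.711 mu-law decoder so it runs without dependencies; `decodeMuLawSample` and `muLawToPcm16` are illustrative names, not part of any library.

```javascript
// Illustrative G.711 mu-law decoder; stands in for alawmulaw.mulaw.decode.
const MULAW_BIAS = 0x84;

function decodeMuLawSample(byte) {
  const u = ~byte & 0xff;            // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + MULAW_BIAS) << exponent) - MULAW_BIAS;
  return sign ? -magnitude : magnitude;
}

// Hypothetical helper: mu-law bytes in, Buffer of 16-bit PCM samples out.
function muLawToPcm16(muLawBytes) {
  const pcm = new Int16Array(muLawBytes.length);
  for (let i = 0; i < muLawBytes.length; i++) {
    pcm[i] = decodeMuLawSample(muLawBytes[i]);
  }
  // Wrap the typed array's underlying memory: 2 bytes per sample,
  // in the platform's native byte order.
  return Buffer.from(pcm.buffer, pcm.byteOffset, pcm.byteLength);
}

const payload = '/4AA'; // base64 for the mu-law bytes 0xFF, 0x80, 0x00
const pcmBuffer = muLawToPcm16(Buffer.from(payload, 'base64'));
// pcmBuffer holds 3 samples x 2 bytes each, whereas
// Buffer.from(new Int16Array(3)) would hold only 3 bytes.
```

With the `alawmulaw` package, the same idea would be to keep the decoded typed array in a variable, say `pcm`, and pass `Buffer.from(pcm.buffer, pcm.byteOffset, pcm.byteLength)` to `speaker.write()`.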
Also, decode each track separately.
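One reading of this advice is to keep the audio from different tracks apart before decoding. Twilio media messages carry a `media.track` field (for example "inbound" or "outbound"), and mixing both into one stream would garble playback. A minimal sketch, assuming that message shape; `createTrackRouter`, `handle`, and `bytesFor` are hypothetical names introduced here for illustration:

```javascript
// Hypothetical helper that groups raw mu-law bytes by Twilio track name.
function createTrackRouter() {
  const chunksByTrack = new Map();
  return {
    // Parses one WebSocket message; returns the track name, or null
    // when the message is not a media event.
    handle(rawMessage) {
      const msg = JSON.parse(rawMessage);
      if (msg.event !== 'media') return null;
      const track = msg.media.track;
      const bytes = Buffer.from(msg.media.payload, 'base64');
      if (!chunksByTrack.has(track)) chunksByTrack.set(track, []);
      chunksByTrack.get(track).push(bytes);
      return track;
    },
    // Returns all bytes received so far for one track, in arrival order.
    bytesFor(track) {
      return Buffer.concat(chunksByTrack.get(track) || []);
    },
  };
}

const router = createTrackRouter();
router.handle(JSON.stringify({ event: 'media', media: { track: 'inbound', payload: '/4AA' } }));
router.handle(JSON.stringify({ event: 'media', media: { track: 'outbound', payload: 'AA==' } }));
const inboundBytes = router.bytesFor('inbound'); // only the inbound track's bytes
```

In a live server, each track's bytes would instead be decoded and written to its own output as they arrive, rather than accumulated in memory.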