I'm working on a webapp where the user can interact with a digital instrument UI to play different music notes, and those notes end up being streamed to my backend. I'm using Web Audio API to convert the notes that are being played into AudioBufferSourceNodes, and am trying to pass the played audio to a stream that my (MediaRecorder API) recorder is listening to in order to generate a final recording playback of all the notes that the user has played.
The final recording contains all the notes played; the problem is that it doesn't account for pauses between the notes. For example, if a user plays a note that lasts 1 second, waits 5 seconds, then plays that same note again, the final recording plays those two notes back to back in the span of 2 seconds, instead of playing the note, waiting 5 seconds, then playing it a second time.
Here's what my WebAudio API and MediaRecorder API code looks like:
// initialize recording stream components
let chunks = [];
let context = new AudioContext();
let stream_destination = context.createMediaStreamDestination();
let recording = new MediaRecorder(stream_destination.stream);
recording.ondataavailable = event => {
  chunks.push(event.data);
};
recording.onstop = () => {
  // use the recorder's actual MIME type; MediaRecorder typically produces WebM or Ogg, not MP3
  document.getElementById("recording").src = URL.createObjectURL(new Blob(chunks, { type: recording.mimeType }));
};
document.getElementById("start").addEventListener("click", () => { recording.start(1000); console.log("recording started"); });
document.getElementById("stop").addEventListener("click", () => { recording.stop(); console.log("recording stopped"); });
// play audio file to user and create sourceNode to add to MediaRecorder
let recordedAudioBuffers = []; // history of the buffers played so far
document.getElementById('sound-button').addEventListener('click', () => {
  const audioFilePath = './media/temp1.mp3';
  const audio = new Audio(audioFilePath);
  audio.play();
  let request = new XMLHttpRequest();
  request.open("GET", audioFilePath, true);
  request.responseType = "arraybuffer";
  request.onload = () => {
    let audioData = request.response;
    context.decodeAudioData(
      audioData,
      (buf) => {
        // add the played buffer to buffer history
        recordedAudioBuffers.push(buf);
        console.log("buffer was successfully added to buffer history. buffers stored: " + recordedAudioBuffers.length);
        const sourceNode = new AudioBufferSourceNode(context, { buffer: buf });
        sourceNode.connect(stream_destination);
        sourceNode.start();
      }
    );
  };
  request.send();
});
Initially, I thought that I could manipulate the timing params for the sourceNode.start() method in order to achieve the result I wanted, but still did not get the final recording I was looking for.
I also looked into whether I could insert "empty" sourceNodes between the sourceNodes that represent the notes played by the user, in order to simulate the pauses between notes in the final recording, but after looking into the options online, it seemed like this would unnecessarily complicate my code/implementation.
As far as I know this is only a problem in Firefox. It can be fixed by connecting a silent ConstantSourceNode to the destination, so that the stream always has an input even while no note is playing.
const constantSourceNode = new ConstantSourceNode(
  context,
  { offset: 0 }
);
constantSourceNode.connect(stream_destination);
constantSourceNode.start();
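If you only want the silent source to run while a recording is in progress, the same idea can be wrapped in a small helper. This is a minimal sketch, not a tested fix: `keepStreamAlive` is a hypothetical name, and it uses `context.createConstantSource()` (the factory-method equivalent of the constructor above) so that the context and destination are passed in explicitly.

```javascript
// Hypothetical helper: feeds a destination node with silence until the
// returned function is called.
function keepStreamAlive(context, destination) {
  const silence = context.createConstantSource();
  silence.offset.value = 0; // a constant output of 0 is silence
  silence.connect(destination);
  silence.start();

  // Call the returned function once the recording has stopped.
  return () => {
    silence.stop();
    silence.disconnect();
  };
}

// Assumed usage with the names from the question:
// const stopSilence = keepStreamAlive(context, stream_destination);
// recording.start(1000);
// ...later...
// recording.stop();
// stopSilence();
```

Starting the silent source before `recording.start()` and stopping it after `recording.stop()` should keep the stream fed for the whole recording. Note that a ConstantSourceNode cannot be restarted after `stop()`, which is why the helper creates a fresh node on every call.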