Tags: javascript, webrtc, web-audio-api

Why can WebAudio's gain node mix multiple inputs when it only has one input?


I was trying to find a way to mix audio streams with WebAudio and record them with a MediaRecorder. I have been following the approach outlined here:

Record multi audio tracks available in a stream with MediaRecorder

It seems that a gain node (in their example, the destination node) can mix multiple streams, as shown by my experiment below. In the example you can hear both the Pirates of the Caribbean theme and the Universal Studios theme playing when you hit the record button.

However, on MDN it says that the gain node (as well as the destination node) has an input count of 1.

Number of inputs: 1

Intuitively, something with a single input cannot perform mixing! Am I missing something here?

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Web Audio Track Recorder</title>
</head>
<body>
  <h1>Web Audio Track Recorder</h1>
  <button id="recordButton">Start Recording</button>
  <script src="script.js"></script>
</body>
</html>

script.js

// URLs of the audio tracks:
const track1Url = 'https://archive.org/download/tvtunes_13046/Pirates%20of%20the%20Caribbean%20-%20Hes%20a%20Pirate.mp3';
const track2Url = 'https://archive.org/download/tvtunes_7590/Universal%20Studios.mp3';

// Initialize variables
let audioContext;
let source1, source2;
let mediaRecorder;
let audioChunks = [];
let isRecording = false;

// Function to start the audio context and load tracks
async function startAudio() {
  audioContext = new (window.AudioContext || window.webkitAudioContext)();

  // Load and decode the audio tracks
  const track1Data = await fetch(track1Url).then(response => response.arrayBuffer());
  const track2Data = await fetch(track2Url).then(response => response.arrayBuffer());

  const track1Buffer = await audioContext.decodeAudioData(track1Data);
  const track2Buffer = await audioContext.decodeAudioData(track2Data);

  // Create buffer sources
  source1 = audioContext.createBufferSource();
  source2 = audioContext.createBufferSource();

  source1.buffer = track1Buffer;
  source2.buffer = track2Buffer;

  // Create a gain node to control the volume
  const gainNode = audioContext.createGain();
  gainNode.gain.setValueAtTime(0.25, audioContext.currentTime);

  // Connect sources to the gain node
  source1.connect(gainNode);
  source2.connect(gainNode);

  // Connect gain node to the destination (speakers)
  gainNode.connect(audioContext.destination);

  // Create a MediaStream from the audio context
  const destination = audioContext.createMediaStreamDestination();
  gainNode.connect(destination);

  // Initialize MediaRecorder with the audio stream
  mediaRecorder = new MediaRecorder(destination.stream);

  // Event handler for when data is available
  mediaRecorder.ondataavailable = event => {
    audioChunks.push(event.data);
  };

  // Event handler for when recording stops
  mediaRecorder.onstop = () => {
    const audioBlob = new Blob(audioChunks, { type: 'audio/webm' });
    const audioUrl = URL.createObjectURL(audioBlob);
    const downloadLink = document.createElement('a');
    downloadLink.href = audioUrl;
    downloadLink.download = 'recording.webm';
    downloadLink.click();
    audioChunks = [];
  };

  // Start the sources
  source1.start();
  source2.start();
}

// Function to toggle recording
function toggleRecording() {
  if (!isRecording) {
    if (!audioContext) {
      startAudio().then(() => {
        mediaRecorder.start();
        document.getElementById('recordButton').textContent = 'Stop Recording';
      });
    } else {
      mediaRecorder.start();
      document.getElementById('recordButton').textContent = 'Stop Recording';
    }
  } else {
    mediaRecorder.stop();
    document.getElementById('recordButton').textContent = 'Start Recording';
  }
  isRecording = !isRecording;
}

// Add event listener to the record button
document.getElementById('recordButton').addEventListener('click', toggleRecording);

You can also try it here: https://jsfiddle.net/10ms7pyd/8/


Solution

  • I have found the section of the spec that specifies this behavior. The confusion was about input connections, namely:

    You CAN have multiple connections to the same input! In that case, the API will:

    1. Up-mix or down-mix each connected signal to the channel count required by the input. For example, a mono signal feeding an input that requires stereo is up-mixed to a stereo signal whose two channels are both copies of the mono channel.

    2. Sum all the input signals, which now have the same channel count, sample by sample.

    You can find the detailed process in the spec: channel-up-mixing-and-down-mixing.
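The two steps can be sketched in plain JavaScript. This is a simplified model of what happens inside a node's input, not the browser's actual implementation: signals are represented as arrays of channels, only mono-to-stereo up-mixing is handled, and the function names (`upMixToStereo`, `mixConnections`) are made up for illustration. The real spec also covers other channel layouts and interpretation modes.

```javascript
// Step 1: up-mix a mono connection to stereo by duplicating its single channel.
// (The spec's up-mix rule for mono -> stereo: both output channels = the mono channel.)
function upMixToStereo(channels) {
  if (channels.length === 2) return channels;
  if (channels.length === 1) return [channels[0], channels[0]];
  throw new Error('Only mono and stereo are handled in this sketch');
}

// Step 2: once every connection has the same channel count,
// sum them sample by sample, channel by channel.
function mixConnections(connections) {
  const stereo = connections.map(upMixToStereo);
  const length = stereo[0][0].length;
  const out = [new Array(length).fill(0), new Array(length).fill(0)];
  for (const conn of stereo) {
    for (let ch = 0; ch < 2; ch++) {
      for (let i = 0; i < length; i++) {
        out[ch][i] += conn[ch][i];
      }
    }
  }
  return out;
}

// A mono connection and a stereo connection feeding the same input,
// like source1 and source2 both connected to the gain node above:
const mono = [[0.5, 0.5]];                  // 1 channel, 2 samples
const stereo = [[0.25, 0.25], [0.1, 0.1]]; // 2 channels, 2 samples
console.log(mixConnections([mono, stereo]));
// left channel: 0.5 + 0.25 = 0.75; right channel: 0.5 + 0.1 = 0.6
```

So "one input" does not mean "one connection": the input is a single summing junction, and every node connected to it is up- or down-mixed and then added into that junction.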