javascript · live-streaming · media-source

How to use MediaSource for live streaming frame-by-frame PNG images over Socket.io?


I'm trying to set up a live stream using Socket.io to deliver PNG frames from the server. For now, I want to test it with MediaSource Extensions since I plan to add audio later, which is why I'm using MediaSource instead of just updating an image tag or canvas. Here’s what I have so far:

import { io } from "https://cdn.socket.io/4.7.5/socket.io.esm.min.js";

const videoElement = document.getElementById('videoPlayer');
const mediaSource = new MediaSource();
videoElement.src = URL.createObjectURL(mediaSource);

const socket = io();

mediaSource.addEventListener('sourceopen', () => {
    // Declare the buffer; assigning to an undeclared variable throws in module (strict-mode) code.
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.42E01E, mp4a.40.2"');

    socket.on('video-frame', (frame) => {
        // Each frame arrives as raw binary PNG data.
        const frameData = new Uint8Array(frame);

        sourceBuffer.appendBuffer(frameData);
    });
});

mediaSource.addEventListener("sourceclose", () => {
    console.log("Source Closed");
});

The problem is that I keep getting an exception from the SourceBuffer:

Uncaught (in promise) InvalidStateError: Failed to execute 'appendBuffer' on 'SourceBuffer': This SourceBuffer has been removed from the parent MediaSource.

This is because the MediaSource closes before I add any frame to the video. Why is that the case? And how can I use MediaSource Extensions to achieve live streaming with frame-by-frame PNG images?


Solution

  • I'm trying to set up a live stream using Socket.io to deliver PNG frames from the server.

    Why? This is probably one of the least efficient ways you could stream video. Is there a reason you need these losslessly encoded frames?

  • For now, I want to test it with MediaSource Extensions

    Well, you can't, because MSE doesn't support just shoving arbitrary frames into a source buffer. You need a proper codec and container, and one supported by MSE at that... which is pretty much just H.264 and AAC in ISOBMFF (MP4). You can check this yourself; see the first sketch at the end of this answer.

  • I plan to add audio later, which is why I'm using MediaSource instead of just updating an image tag or canvas.

    Just because you intend to have an audio track doesn't mean you can't display your image/video track on a canvas.

    Just don't do any of this. And there's no need for Socket.IO either. If you really want to shove raw frames to the client like this, then you actually do need to draw them to a canvas (see the canvas sketch below). If you don't need raw frames, use a proper codec.

    There are also some new objects in the WebCodecs API that may be useful to you, but they aren't broadly supported yet; a feature-detected sketch follows at the end.
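
    As a quick sanity check, MediaSource.isTypeSupported() reports whether a given MIME string can back a source buffer at all. A minimal sketch; the MIME strings below are just illustrative examples:

    // Probe a few MIME strings before ever calling addSourceBuffer().
    const candidates = [
        'video/mp4; codecs="avc1.42E01E, mp4a.40.2"', // H.264 + AAC in fragmented MP4
        'image/png',                                  // not a media container: reports false
    ];

    for (const type of candidates) {
        console.log(type, MediaSource.isTypeSupported(type));
    }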
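
    If you do keep the raw-PNG approach, here is a minimal sketch of painting each incoming frame onto a canvas instead. It assumes the server still emits binary PNG data on a 'video-frame' event, as in your code, plus a hypothetical <canvas id="videoCanvas"> element:

    const canvas = document.getElementById('videoCanvas'); // hypothetical element
    const ctx = canvas.getContext('2d');

    socket.on('video-frame', async (frame) => {
        // Decode the PNG bytes into a bitmap, then paint it.
        const bitmap = await createImageBitmap(new Blob([frame], { type: 'image/png' }));
        ctx.drawImage(bitmap, 0, 0, canvas.width, canvas.height);
        bitmap.close(); // release the decoded frame's memory
    });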
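
    And for completeness, the WebCodecs route, feature-detected since support is still uneven. Same assumptions about the incoming event and the canvas as above:

    if ('ImageDecoder' in window) {
        const canvas = document.getElementById('videoCanvas');
        const ctx = canvas.getContext('2d');

        socket.on('video-frame', async (frame) => {
            // ImageDecoder yields a VideoFrame, which drawImage() accepts directly.
            const decoder = new ImageDecoder({ data: frame, type: 'image/png' });
            const { image } = await decoder.decode();
            ctx.drawImage(image, 0, 0, canvas.width, canvas.height);
            image.close();   // free the decoded frame
            decoder.close(); // free the decoder
        });
    }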