
How does loading HTML5 video chunks actually work?


My understanding is this: while a video player plays media, it downloads the file in chunks defined by the Range request header, and the server serves only the requested bytes of the file.
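As a rough illustration of that understanding, here is a minimal sketch of how a server might turn a Range request header into the Content-Range and Content-Length response values. The function name `resolve_range` is hypothetical, and only the single-range forms `bytes=START-` and `bytes=START-END` are handled; this is not how Apache or any particular server implements it.

```python
import re


def resolve_range(range_header: str, file_size: int):
    """Resolve a single byte range against a file of file_size bytes.

    Returns (start, end, content_length, content_range_value), end inclusive.
    Suffix ranges ('bytes=-N') and multi-range requests are deliberately
    left out of this sketch.
    """
    match = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if match is None:
        raise ValueError(f"unsupported Range header: {range_header!r}")

    start = int(match.group(1))
    # An open-ended range such as 'bytes=32768-' runs to the last byte.
    end = int(match.group(2)) if match.group(2) else file_size - 1
    end = min(end, file_size - 1)
    if start > end:
        # A real server would answer 416 Range Not Satisfiable here.
        raise ValueError("range not satisfiable")

    content_length = end - start + 1
    content_range = f"bytes {start}-{end}/{file_size}"
    return start, end, content_length, content_range


if __name__ == "__main__":
    # File size and range taken from the headers listed in this question.
    print(resolve_range("bytes=100237312-", 100450390))
```

Note that an open-ended range is a request for everything from the offset to the end of the file, so the response describes the whole remainder rather than a small fixed-size chunk.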

However, this is what happens when I open an MP4 file in Chromium:

[Screenshot of the requests; image not available]

Here are the content-related response headers on each request, along with the RANGE request header, in the same order:

---
accept-ranges: bytes
content-encoding: gzip
content-type: video/mp4

range: bytes=0-
---
accept-ranges: bytes
Content-Length: 100450390
Content-Range: bytes 0-100450389/100450390
content-type: video/mp4

range: bytes=100237312-
---
accept-ranges: bytes
Content-Length: 213078
Content-Range: bytes 100237312-100450389/100450390
content-type: video/mp4

range: bytes=32768-
---
accept-ranges: bytes
Content-Length: 100417622
Content-Range: bytes 32768-100450389/100450390
content-type: video/mp4

range: bytes=6127616-
---
accept-ranges: bytes
Content-Length: 94322774
Content-Range: bytes 6127616-100450389/100450390
content-type: video/mp4

From what I see, the second request asks for the entire file, the third for roughly the last 200 KB of the file, the fourth for the file starting 32 KB in, and the last for the file starting at around the 6 MB mark.
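To double-check that reading, the Content-Range values can be run through a small script. This is only a sketch: the helper name `describe` is made up for illustration, and the values are copied from the responses listed above.

```python
import re

# Content-Range values copied from the responses listed in the question.
CONTENT_RANGES = [
    "bytes 0-100450389/100450390",
    "bytes 100237312-100450389/100450390",
    "bytes 32768-100450389/100450390",
    "bytes 6127616-100450389/100450390",
]


def describe(content_range: str):
    """Return (start_offset, body_length, fraction_of_file) for one value."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", content_range)
    if match is None:
        raise ValueError(f"unexpected Content-Range: {content_range!r}")
    start, end, total = map(int, match.groups())
    length = end - start + 1  # both ends of the range are inclusive
    return start, length, length / total


if __name__ == "__main__":
    for value in CONTENT_RANGES:
        start, length, fraction = describe(value)
        print(
            f"offset {start / 1024:10.1f} KiB, "
            f"body {length / 1024:10.1f} KiB ({fraction:.1%} of the file)"
        )
```

Each computed body length equals the Content-Length the server reported for that response, which is the advertised size of the response rather than the number of bytes actually transferred.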

This alone looks chaotic, and the screenshot at the beginning of the question adds to the confusion: it shows that half of the responses don't match the requested sizes.

Furthermore, the last request stays alive from the moment the video starts until its very end, and the video is 3 minutes long. Isn't it bad for the server (I'm talking about Apache in particular) to keep a request alive for so long, idling for a large portion of its lifespan?

I guess if I have to summarize my question, I'd like to understand what is going on and to make sense of all of this.


Solution

  • Each video has a MOOV atom that describes the video.

    Most encoders put that at the front of the mp4, so when you download the video - it's right there, first thing - and the player can decode the information about the video.

    However, some encoders place the MOOV at the end of the file. Rather than download all 100 MB of the video to get to the MOOV, the player uses range requests.

    1. Does the video exist on the server? (40b - yes!)
    2. First 32 KB has no moov... maybe at the end?
    3. Last 200 KB - Is there a moov? Yes - ok we can continue
    4. Ok we have 0-32KB (from step 2) .. let's resume 32 KB ->
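The probing in steps 2 and 3 amounts to checking whether a downloaded slice contains a top-level `moov` box. Below is a minimal sketch of that check, assuming the usual MP4 box layout (a 4-byte big-endian size followed by a 4-byte type) and a slice that starts on a box boundary. The helper names `top_level_boxes`, `moov_position`, and `make_box` are hypothetical; this is not Chromium's actual code.

```python
import struct


def top_level_boxes(data: bytes):
    """Yield (box_type, offset, size) for each top-level box found in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop instead of looping forever
        yield box_type.decode("ascii", "replace"), offset, size
        offset += size


def moov_position(data: bytes):
    """Return the offset of the top-level moov box, or None if not seen."""
    for box_type, offset, _size in top_level_boxes(data):
        if box_type == "moov":
            return offset
    return None


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a synthetic box, used here only to fake tiny MP4 files."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


if __name__ == "__main__":
    ftyp = make_box(b"ftyp", b"isom")
    moov = make_box(b"moov", b"x" * 10)
    mdat = make_box(b"mdat", b"y" * 100)

    moov_first = ftyp + moov + mdat  # metadata at the front of the file
    moov_last = ftyp + mdat + moov   # metadata at the end of the file

    print(moov_position(moov_first))      # found while reading the start
    print(moov_position(moov_last[:64]))  # not in the first slice
    print(moov_position(moov_last))       # only found near the end
```

With the metadata at the front, the first slice is enough. With it at the end, the first slice only reveals a large `mdat` box, which is the situation where a player would go looking at the tail of the file.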