I'm using the WebCodecs AudioDecoder to decode OGG files (vorbis and opus). The codec string setting in the AudioDecoder configuration is vorbis and opus, respectively.
I have the container parsed into pages, and the AudioDecoder is almost ready for work.
However, I'm unable to figure out the description field it's expecting. I've read up on Vorbis WebCodecs Registration, but I'm still lost. That is:
let decoder = new AudioDecoder({ ... });
decoder.configure({
description: "", // <----- What do I put here?
codec: "vorbis",
sampleRate: 44100,
numberOfChannels: 2,
});
Edit: I understand it's expecting key information about how the OGG file is structured. What I don't understand is what goes there exactly. How does the string even look? Is it a dot-separated string of arguments?
https://www.w3.org/TR/webcodecs-vorbis-codec-registration/#audiodecoderconfig-description
AudioDecoderConfig.description is required. It is assumed to be in Xiph extradata format, described in [OGG-FRAMING]. This format consists in the page_segments field, followed by the segment_table field, followed by the three Vorbis header packets, respectively the identification header, the comments header, and the setup header, in this order, as described in section 4.2 of [VORBIS].
https://www.w3.org/TR/webcodecs-opus-codec-registration/#audiodecoderconfig-description
AudioDecoderConfig.description can optionally be set to an Identification Header, described in section 5.1 of [OPUS-IN-OGG].
If an AudioDecoderConfig.description has been set, the bitstream is assumed to be in ogg format.
If an AudioDecoderConfig.description has not been set, the bitstream is assumed to be in opus format.
If you want a good explanation of how the OGG/Opus header is structured, [OPUS-IN-OGG] is quite instructive.
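To make that layout concrete, here is a minimal sketch that reads the fixed fields of the identification header as laid out in section 5.1 of [OPUS-IN-OGG]. The function name parseOpusHead and the returned property names are my own, and the sketch assumes packet is a Uint8Array holding only the header packet, without the surrounding Ogg page header.

```javascript
// Sketch: read the fixed fields of an Ogg Opus identification header.
// ASSUMPTION: `packet` is a Uint8Array containing just the "OpusHead"
// packet (the page payload), not the 27+ byte Ogg page header.
function parseOpusHead(packet) {
  if (packet.length < 19) {
    throw new Error("OpusHead packet is too short");
  }
  const magic = String.fromCharCode(...packet.subarray(0, 8));
  if (magic !== "OpusHead") {
    throw new Error("missing OpusHead magic signature");
  }
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  return {
    version: view.getUint8(8),
    channelCount: view.getUint8(9),
    preSkip: view.getUint16(10, true),         // little-endian
    inputSampleRate: view.getUint32(12, true), // informational, little-endian
    outputGain: view.getInt16(16, true),       // signed, little-endian
    channelMappingFamily: view.getUint8(18),
  };
}
```

The channelCount value is a plausible source for numberOfChannels in the decoder configuration, and the same packet bytes are what the Opus registration describes as the description.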
The OGG/Vorbis description is vaguer. There is no documentation on what Xiph extradata is, so one can only trust the W3C registration on how it is structured, and compare it to the OGG/Vorbis docs for the fields ([OGG-FRAMING]).
Essentially, you need to provide the decoder with the relevant binary data headers for the file you are decoding, as ArrayBuffer, TypedArray, or DataView. You can get this from the binary file contents you are decoding.
Unfortunately, to get at this data, you will likely need to parse the format of the underlying OGG container. The WebCodecs API is intended for low-level use, that is, for handling codecs, not the containers themselves. See this GitHub issue where someone runs into similar issues regarding descriptions, and is told to parse the container themselves. Parsing the container is outside of the scope of this API.
Perhaps you could use an external OGG parsing library, or opt for a higher level audio processing class like the WebAudio API or a WebAssembly library?
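If you do parse the container yourself, the header packets come out of the page bodies via each page's segment table, as described in [OGG-FRAMING]. The following is only a sketch for a single page: readOggPage and its return shape are hypothetical names of mine, the CRC is not verified, and reassembling a packet that continues across pages (which the Vorbis setup header may do) is left to the caller.

```javascript
// Sketch: split one Ogg page into the packet payloads it carries.
// ASSUMPTION: `bytes` is a Uint8Array and `offset` points at the start
// of a page (the "OggS" capture pattern). No CRC check is performed.
function readOggPage(bytes, offset = 0) {
  const capture = String.fromCharCode(...bytes.subarray(offset, offset + 4));
  if (capture !== "OggS") {
    throw new Error("no Ogg capture pattern at offset " + offset);
  }
  // Bit 0x01 of header_type: the first packet continues from the previous page.
  const continued = (bytes[offset + 5] & 0x01) !== 0;
  const pageSegments = bytes[offset + 26]; // number of lacing values
  const segmentTable = bytes.subarray(offset + 27, offset + 27 + pageSegments);

  const packets = [];
  let pos = offset + 27 + pageSegments; // start of the page body
  let start = pos;
  let lastPacketComplete = true;
  for (const lacing of segmentTable) {
    pos += lacing;
    lastPacketComplete = lacing < 255; // 255 means the packet continues
    if (lastPacketComplete) {
      packets.push(bytes.subarray(start, pos));
      start = pos;
    }
  }
  if (!lastPacketComplete) {
    packets.push(bytes.subarray(start, pos)); // partial packet, ends on a later page
  }
  return { continued, packets, lastPacketComplete, nextOffset: pos };
}
```

Calling it repeatedly with the returned nextOffset walks the file page by page. For OGG/Opus the first packet of the first page would be the identification header, and for OGG/Vorbis the first three packets would be the three headers.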
UPDATE:
To clarify what should go into the description: in Chromium, the description field is passed directly to FFmpeg's extradata.
The docs specify that for OGG/Opus, you should set the description to the identification header (in binary), that is, the packet carried in the 0th page.
For OGG/Vorbis, the documentation is pretty bad, and quite vague. I'll be checking the FFmpeg source for this. It seems to be the identification header followed by the setup header; since the setup header is the third header, the mandatory comment header would sit in between.
So, to summarise what should go in the description field, you should put the binary contents of the headers of the relevant codec. For OGG/Opus, you would provide the binary contents of the first page (the identification header). For OGG/Vorbis, you would provide the binary contents of the first three packets (the identification header, comment header, and setup header).
The documentation suggests codec-parser provides the data for the headers as OpusHeader.data and VorbisHeader.{data,comments,setup}.
Try concatenating the three together for Vorbis and see if that works. (Note that comments and setup are not initialized at the same time as data.)
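// opus: hdr is the OpusHeader from codec-parser
let opusDesc = hdr.data;
// vorbis: hdr is the VorbisHeader from codec-parser;
// concatenate the identification, comment, and setup headers
let vorbisDesc = new Uint8Array(hdr.data.length + hdr.comments.length + hdr.setup.length);
vorbisDesc.set(hdr.data);
vorbisDesc.set(hdr.comments, hdr.data.length);
vorbisDesc.set(hdr.setup, hdr.data.length + hdr.comments.length);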
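If plain concatenation is rejected, the registration quoted above says the three headers are preceded by a page_segments field and a segment_table field. Below is a minimal sketch of one reading of that. It is an assumption on my part, not something the registration spells out: the first byte is the number of packets minus one (so 2), followed by the sizes of the first two headers in Xiph lacing (runs of 255 plus a final value below 255), with the size of the setup header implied by the remaining length. The names xiphLace and buildVorbisDescription are hypothetical.

```javascript
// Sketch: pack the three Vorbis headers as Xiph-laced extradata.
// ASSUMPTION: layout is [packet count - 1][laced sizes of the first two
// packets][identification][comments][setup]; the last size is implied.

// Encode a size as Xiph lacing values: 255 repeated, then a value < 255.
function xiphLace(size) {
  const out = [];
  while (size >= 255) {
    out.push(255);
    size -= 255;
  }
  out.push(size); // terminating value, may be 0
  return out;
}

// `id`, `comments`, and `setup` are assumed to be Uint8Arrays, for example
// VorbisHeader.data, .comments, and .setup from codec-parser.
function buildVorbisDescription(id, comments, setup) {
  const prefix = [2, ...xiphLace(id.length), ...xiphLace(comments.length)];
  const desc = new Uint8Array(
    prefix.length + id.length + comments.length + setup.length
  );
  desc.set(prefix, 0);
  desc.set(id, prefix.length);
  desc.set(comments, prefix.length + id.length);
  desc.set(setup, prefix.length + id.length + comments.length);
  return desc;
}
```

The result would be passed as description in decoder.configure() in place of the plain concatenation. Whether a given browser accepts it is something to verify by testing.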