
MLT/XML: optimizing repetitive attribute lists in a tag


I'm working on a project that generates MLT files from user input. A typical file may contain thousands of nearly identical filters. Each filter has 11 attributes, but only 3 of them change from filter to filter, as the example below shows. Is there any way to reduce this repetition?

<?xml version='1.0' encoding='utf-8'?>
<mlt>
  <profile width="1920" height="1080"/>
  <producer mlt_service="color"
               resource="black"
                     in="0"
                    out="89"/>
  <filter mlt_service="text"
             geometry="1.354%/7.407%:78.125%x77.407%:100"
               family="Nimbus Sans L"
                 size="1000"
             fgcolour="white"
             bgcolour="0"
               halign="right"
               valign="middle"
             argument="1"
                   in="0"
                  out="29"/>
  <filter [...same first 8 attribute settings...]
             argument="2"
                   in="30"
                  out="59"/>
  <filter [...same first 8 attribute settings...]
             argument="3"
                   in="60"
                  out="89"/>

  [...thousands more similar filters...]

</mlt>

I know SVG provides a <defs> element which, combined with <use> and an xlink:href reference, allows for things like this:

<svg xmlns="http://www.w3.org/2000/svg"
     xmlns:xlink="http://www.w3.org/1999/xlink">
  <defs>
    <tag id="my-tag" [...reusable attributes...] />
  </defs>
  <use xlink:href="#my-tag" [...instance-specific attributes...] />
</svg>

Something like that would be useful in my case. Is there anything similar in MLT? If not, is there a way to modify the DTD to implement something similar?

I tried using XML entities, but as far as I understand, entity references are only recognized inside a quoted attribute value or in the content between an element's start and end tags, so they can't stand in for a whole list of attributes.

Any guidance is appreciated.

Thanks


Solution

  • I found a simple workaround that suffices in my case. I'm including it here in case anyone else finds this solution useful.

    In the MLT file, I replaced each redundant attribute list with a dummy attribute that MLT doesn't recognize (global="" in my example below). Then I wrote a sed script that expands each dummy attribute back into the original attribute list. I use the sed script to turn the abbreviated file into a complete MLT file just before calling melt.

    This works for me because I can store a large number of abbreviated MLT files without wasting disk space, and none of them needs to be a complete MLT file until I actually run it through melt. When I need to render one, I create a temporary expanded version with the sed script, run that through melt, and delete the temporary version when I'm done.

    It's worth mentioning that sed is absurdly fast: it processed a test file containing 100,000 dummy attributes in less than half a second. This method also cuts my MLT file sizes in half, so it's worth it when disk space is a concern and a lot of MLT files need to be stored.

    For example, file.mlt:

    <?xml version='1.0' encoding='utf-8'?>
    <mlt>
      <profile width="1920" height="1080"/>
      <producer mlt_service="color" resource="black" in="0" out="89"/>
    
      <filter global="" argument="1" in="0"  out="29"/>
      <filter global="" argument="2" in="30" out="59"/>
      <filter global="" argument="3" in="60" out="89"/>
    
      [...thousands more similar filters...]
    
    </mlt>
    

    expand-global-attributes.sed:

    # '|' is used as the delimiter because the geometry value contains '/'
    s|global=""|mlt_service="text" family="Nimbus Sans L" [...]|
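
To make the expansion rule concrete, here is a minimal sketch that writes a full version of the script and runs it on a two-filter sample. It assumes the eight shared attributes are exactly the ones listed in the question; the temporary directory and the file names sample.mlt and expanded.mlt are made up for the illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
work=$(mktemp -d)

# The expansion rule. '|' delimits the s command because the geometry
# value contains '/'. Attribute values are assumed from the question.
cat > "$work/expand-global-attributes.sed" <<'EOF'
s|global=""|mlt_service="text" geometry="1.354%/7.407%:78.125%x77.407%:100" family="Nimbus Sans L" size="1000" fgcolour="white" bgcolour="0" halign="right" valign="middle"|
EOF

# A small abbreviated file with two filters (hypothetical sample).
cat > "$work/sample.mlt" <<'EOF'
<?xml version='1.0' encoding='utf-8'?>
<mlt>
  <profile width="1920" height="1080"/>
  <producer mlt_service="color" resource="black" in="0" out="59"/>
  <filter global="" argument="1" in="0"  out="29"/>
  <filter global="" argument="2" in="30" out="59"/>
</mlt>
EOF

# Expand the dummy attribute and count the lines that now carry the text filter.
sed -f "$work/expand-global-attributes.sed" "$work/sample.mlt" > "$work/expanded.mlt"
grep -c 'mlt_service="text"' "$work/expanded.mlt"   # prints 2
```

Since each filter sits on its own line with a single global="" attribute, the s command needs no g flag; if several filters ever shared a line, appending g to the rule would be necessary.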
    

    Here's the command I used at the terminal:

    sed -f expand-global-attributes.sed file.mlt > tmp.mlt && \
      melt tmp.mlt -consumer avformat:file.mp4 && \
      rm tmp.mlt
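
One caveat with the command chain above is that tmp.mlt is left behind if melt fails. A possible variation, sketched here under my own assumptions, wraps the steps in a function that cleans up on any exit. The function name render_mlt and the MELT override variable are hypothetical and not part of melt itself.

```shell
# Hypothetical helper: expand an abbreviated MLT file and render it.
# Usage: render_mlt SRC.mlt OUT.mp4 [RULES.sed]
# Set MELT (e.g. MELT=echo) to dry-run without melt installed.
render_mlt() (
  src=$1
  out=$2
  rules=${3:-expand-global-attributes.sed}

  # The body runs in a subshell, so this EXIT trap fires when the
  # function finishes, whether melt succeeded or not.
  tmpdir=$(mktemp -d) || exit 1
  trap 'rm -rf "$tmpdir"' EXIT

  # Keep the .mlt extension on the expanded copy.
  sed -f "$rules" "$src" > "$tmpdir/expanded.mlt" || exit 1
  "${MELT:-melt}" "$tmpdir/expanded.mlt" -consumer "avformat:$out"
)
```

It would be called as render_mlt file.mlt file.mp4. Note that the expanded copy lives in a different directory from the original, which may matter if the MLT file refers to media by relative path; the example in this answer uses none.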