php mathml htmlpurifier

PHP HTML Purifier and MathML


Is there any simple way to allow all MathML tags with attributes in HTML Purifier?

I tried adding all the MathML tags listed at https://developer.mozilla.org/en-US/docs/Web/MathML/Element/semantics, along with their attributes, to HTML.Allowed, but I don't know if this is the right way.


Solution

  • There's currently no native support for MathML in HTML Purifier. There's an old pull request you could potentially repurpose, but since it's a few years old, patching it in will almost certainly require significant manual effort. The discussion around it is also worth reading:

    The primary consideration is security. When adding a very big new extension like MathML, it is very tempting to cut corners, and not truly understand every corner of the specification and build a parser that truly understands what it reads, and isn't just checking syntax blindly.

    Alternatively, you could use the customization guide to add the MathML elements as new tags and attributes to HTML Purifier, but that's more work, not less.

    Simply adding the tags to HTML.Allowed won't do much. HTML Purifier's strength is that it understands the context that tags appear in: where they're allowed to appear and what restrictions make sense on their attributes. For example, an attribute like 'width' takes integers, an attribute like 'style' takes CSS (which is sanitised separately), and an attribute like 'onclick' is unsafe by definition. If HTML Purifier doesn't know anything about a particular tag, it won't allow it, even if you add it to the allowlist, because it won't know how to actually handle the tag.

    In short:

    No, there is unfortunately no simple way to allow MathML in HTML Purifier.
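
To give a sense of what the customization-guide route involves, here is a minimal sketch that registers a tiny, hand-picked subset of MathML (math, mrow, mi, mn, mo) as custom elements. It is not MathML support: the element list, content models and attribute enums are illustrative assumptions rather than a faithful reading of the MathML specification, the definition ID 'mathml-sketch' is a made-up name, and the autoload path assumes HTML Purifier was installed with Composer.

```php
<?php
// Assumption: HTML Purifier installed via Composer (ezyang/htmlpurifier).
require_once __DIR__ . '/vendor/autoload.php';

$config = HTMLPurifier_Config::createDefault();
// A definition ID is required before the raw definition can be edited.
// 'mathml-sketch' is an arbitrary, hypothetical name.
$config->set('HTML.DefinitionID', 'mathml-sketch');
$config->set('HTML.DefinitionRev', 1);
// Disable the definition cache while experimenting.
$config->set('Cache.DefinitionImpl', null);

if ($def = $config->maybeGetRawHTMLDefinition()) {
    // Illustrative content model: which children each container may hold.
    $children = 'Optional: mrow | mi | mn | mo';

    // <math> is treated as an inline element so it can appear in paragraphs.
    $def->addElement('math', 'Inline', $children, false, array(
        'display' => 'Enum#block,inline',
    ));
    // The remaining elements are only valid inside the containers above.
    $def->addElement('mrow', false, $children, false);
    foreach (array('mi', 'mn', 'mo') as $token) {
        $def->addElement($token, false, 'Optional: #PCDATA', false, array(
            // Deliberately small subset of values, for illustration only.
            'mathvariant' => 'Enum#normal,bold,italic',
        ));
    }
}

$purifier = new HTMLPurifier($config);

$dirty = '<math display="block" onclick="alert(1)">'
       . '<mi mathvariant="bold">x</mi><mo>+</mo><mn>1</mn>'
       . '</math>';

// Attributes that were never declared (such as onclick) are dropped,
// and MathML elements that were not registered are still stripped.
$clean = $purifier->purify($dirty);
echo $clean, PHP_EOL;
```

Every element, every permitted child and every attribute type has to be declared by hand like this, which is why covering all of MathML properly is a substantial piece of work rather than a configuration tweak.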