htmlpurifier

How to add attributes to elements with HtmlPurifier?


I'm looking to purify HTML with the HtmlPurifier package and add attributes to certain elements. Specifically, I'd like to add classes to <div> and <p> elements so that this:

<div>
    <p>
        Hello
    </p>
</div>

gets purified/transformed into this:

<div class="div-class">
    <p class="p-class">
        Hello
    </p>
</div>

How would one go about doing this with HtmlPurifier? Is it possible?


Solution

  • I believe something along these lines should do it (treat this as pseudocode, though, since it has been years since I last had this working):

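    // Attribute transform that sets the class on every <div>.
    class HTMLPurifier_AttrTransform_DivClass extends HTMLPurifier_AttrTransform
    {
        public function transform($attr, $config, $context)
        {
            $attr['class'] = 'div-class'; // replaces any class already present
            return $attr;
        }
    }

    // Same idea for <p>.
    class HTMLPurifier_AttrTransform_ParaClass extends HTMLPurifier_AttrTransform
    {
        public function transform($attr, $config, $context)
        {
            $attr['class'] = 'p-class'; // replaces any class already present
            return $attr;
        }
    }

    // Assumes HTML Purifier is already loaded (e.g. via Composer's autoloader).
    $config  = HTMLPurifier_Config::createDefault();
    $htmlDef = $config->getHTMLDefinition(true); // raw, editable definition

    // Register the transforms so they run after attribute validation.
    $div = $htmlDef->addBlankElement('div');
    $div->attr_transform_post[] = new HTMLPurifier_AttrTransform_DivClass();
    $para = $htmlDef->addBlankElement('p');
    $para->attr_transform_post[] = new HTMLPurifier_AttrTransform_ParaClass();

    // Purify with the customized configuration.
    $purifier = new HTMLPurifier($config);
    echo $purifier->purify('<div><p>Hello</p></div>');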
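
    One thing to watch in the code above: assigning to $attr['class'] replaces whatever class the element already had. If existing classes should survive, a small helper along these lines could be called from transform() instead. Note that addClass is a hypothetical name, not part of HTML Purifier; this is only a minimal sketch in plain PHP:

    ```php
    <?php
    // Hypothetical helper (not part of HTML Purifier): appends a class name to
    // an attribute array without discarding classes that are already present.
    function addClass(array $attr, string $class): array
    {
        // Split any existing class list on whitespace, dropping empty entries.
        $existing = isset($attr['class'])
            ? preg_split('/\s+/', trim($attr['class']), -1, PREG_SPLIT_NO_EMPTY)
            : [];

        // Append only if the class is not already in the list.
        if (!in_array($class, $existing, true)) {
            $existing[] = $class;
        }

        $attr['class'] = implode(' ', $existing);
        return $attr;
    }

    // Inside a transform, the body would then become:
    //     return addClass($attr, 'div-class');
    ```

    Whether you want to merge or overwrite depends on whether user-supplied classes are allowed through in the first place.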
    

    Remember to allowlist the class attribute for div and p as well, if you haven't already.
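
    As a rough illustration of that allowlisting step, and assuming you restrict markup through the HTML.Allowed directive, the configuration could look something like the fragment below. Here $config is a placeholder name for your HTMLPurifier_Config instance, and the element list is only an example; adapt it to the tags you actually permit.

    ```php
    // Assumption: $config is your HTMLPurifier_Config instance.
    // Directives generally need to be set before the HTML definition is
    // retrieved, so this belongs ahead of the getHTMLDefinition() call.
    // Note that HTML.Allowed limits output to exactly the elements and
    // attributes listed here.
    $config->set('HTML.Allowed', 'div[class],p[class]');
    ```

    If you haven't restricted elements or attributes at all, this step may be unnecessary.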

    That said, at first glance HTML Purifier doesn't seem like the right place for this kind of logic, since adding class names isn't relevant to the security of your site (or is it?). If you're already using HTML Purifier to allowlist your HTML tags, attributes and values, and just want to leverage its HTML-parsing capabilities for some lightweight additional DOM manipulation, I see no particular reason not to. :) But it might be worth considering whether to add the classes in some other process instead (e.g. in the frontend, if that's relevant for your use case).