java xss owasp

OWASP sanitizer generates unexpected results


I am using the OWASP Java HTML Sanitizer to clean up input data. Below is the policy I use:

             return new HtmlPolicyBuilder()
                    .allowElements("a", "label", "h1", "h2", "h3", "h4", "h5", "h6",
                            "p", "i", "b", "u", "strong", "em", "small", "big", "pre", "code",
                            "cite", "samp", "sub", "sup", "strike", "center", "blockquote",
                            "hr", "br", "col", "font", "span", "div", "img",
                            "ul", "ol", "li", "dd", "dt", "dl", "tbody", "thead", "tfoot",
                            "table", "td", "th", "tr", "colgroup", "fieldset", "legend")
                    .allowAttributes("src", "alt", "align", "title", "hspace", "vspace").onElements("img")
                    .allowAttributes("href", "target").onElements("a")
                    .allowAttributes("border", "cellpadding", "cellspacing", "style", "class").onElements("table")
                    .allowAttributes("colspan", "rowspan", "style", "class", "align", "valign").onElements("td")
                    .allowAttributes("border", "height", "width").globally()
                    .allowStandardUrlProtocols()
                    .requireRelNofollowOnLinks()
                    .allowStyling()
                    .toFactory();

When my input is <a>test</a>, I expect to get the same string back, because the "a" tag is allowed. However, it returns only "test". Here is my Gradle dependency:

compile group: 'com.googlecode.owasp-java-html-sanitizer', name: 'owasp-java-html-sanitizer', version: '20191001.1'

I also tried disallowAttributes("script"), but that didn't help either. Any ideas? Thanks.


Solution

  • So when my input is <a>test</a>, I expect it will return me the same result, cause I am allowing "a" tag. However, it returns "test" only.

    Some elements are dropped by default when they carry no attributes, even if the element itself is allowed. Add the following clause to the policy builder:

    .allowWithoutAttributes("a")
    

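    This removes the a element from the builder's internal HtmlPolicyBuilder.skipIfEmpty set, which by default contains a, font, img, input, and span (see HtmlPolicyBuilder.DEFAULT_SKIP_IF_EMPTY). Elements in that set are stripped whenever they end up with no attributes, while their text content is kept.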
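
    To see the difference in isolation, here is a minimal sketch that compares two stripped-down policies. It assumes the owasp-java-html-sanitizer dependency from the question is on the classpath. The class and method names (AllowEmptyAnchorDemo, defaultPolicy, lenientPolicy) are made up for illustration, and the policies are deliberately much smaller than the one in the question.

```java
import org.owasp.html.HtmlPolicyBuilder;
import org.owasp.html.PolicyFactory;

// Hypothetical demo class; not part of the sanitizer library.
public class AllowEmptyAnchorDemo {

    // Allows <a>, but leaves the default skip-if-empty set untouched,
    // so an <a> without attributes is dropped and only its text is kept.
    static PolicyFactory defaultPolicy() {
        return new HtmlPolicyBuilder()
                .allowElements("a")
                .toFactory();
    }

    // Same policy, but <a> is also kept when it has no attributes.
    static PolicyFactory lenientPolicy() {
        return new HtmlPolicyBuilder()
                .allowElements("a")
                .allowWithoutAttributes("a")
                .toFactory();
    }

    public static void main(String[] args) {
        String input = "<a>test</a>";
        // Behavior reported in the question: the tag is stripped, leaving "test".
        System.out.println(defaultPolicy().sanitize(input));
        // With allowWithoutAttributes("a"), the <a> tag should survive.
        System.out.println(lenientPolicy().sanitize(input));
    }
}
```

    In the original policy, the same effect should come from adding .allowWithoutAttributes("a") anywhere in the builder chain before .toFactory(). Note that requireRelNofollowOnLinks() may still affect the attributes emitted on links, so the output there is not guaranteed to be byte-identical to the input.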