
Apache Commons Text StringEscapeUtils vs JSoup for XSS prevention?


I want to clean user input to help prevent XSS attacks. We don't particularly need an HTML whitelist, as our users shouldn't need to post any HTML / CSS.

Eyeing the alternatives out there, which would be better? Apache Commons Text's StringEscapeUtils or JSoup Cleaner?

Update:

I went with JSoup after writing some unit tests for both it and Apache Commons Text.

I like how JSoup won't mess with single quotation marks (i.e. "Alan's mom" is left unchanged, whereas Apache Commons Text escapes the apostrophe into a character entity).

And the whitelist wasn't a problem at all. It didn't require any configuration; rather, JSoup ships with some built-in options, which may come in handy if we choose to allow some subsets of HTML tags.


Solution

  • "Better"? I don't think it matters. Cleaner has a Whitelist.none(), which strips every tag and keeps only the text, while the escape utils will escape everything.

    It depends on how you want the "cleaned" input to render: do you want just the text nodes, or do you want the escaped HTML to show up?
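
To make the two rendering outcomes concrete, here is a rough, stdlib-only sketch. The class EscapeVsStrip and both of its methods are hypothetical stand-ins written for illustration; they are not the libraries' implementations. In real code the escaping path would be something like StringEscapeUtils.escapeHtml4(input), and the text-only path Jsoup.clean(input, Whitelist.none()) (newer jsoup releases name the class Safelist). The regex-based tag stripping in particular is a simplification and should not be relied on for actual XSS protection.

```java
import java.util.regex.Pattern;

// Hypothetical helper, stdlib only: a rough stand-in for the two behaviours.
// Assumption: this only illustrates the difference in output; use the real
// libraries (Commons Text, jsoup) for production sanitizing.
final class EscapeVsStrip {

    // Naive tag matcher. A real HTML parser such as jsoup handles far more cases.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    // "Escape everything": markup characters become entities, so the tags
    // appear as literal text when the page is rendered.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // "Text nodes only": drop the tags, keep the text between them,
    // then escape whatever markup characters are left over.
    static String textOnly(String input) {
        return escapeHtml(TAG.matcher(input).replaceAll(""));
    }

    public static void main(String[] args) {
        String input = "<b>Alan's mom</b> says <i>hi</i>";
        System.out.println(escapeHtml(input)); // tags survive as visible text
        System.out.println(textOnly(input));   // only the text remains
    }
}
```

With the sample input above, the first call keeps the markup visible as escaped text, while the second leaves just "Alan's mom says hi". Which one is "better" comes down to which of those two you want users to see.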