How do I properly encode query parameters? I pass the data through AntiSamy, which does the cleanup. After that I need to pass the cleaned-up string to the unescapeHtml4 method, because sometimes I have JSON as my request parameter (see Example 1 below).
Currently, my code is as follows:
String str = StringEscapeUtils.unescapeXml(xml);
CleanResults cr = antiSamy.scan(str);
String cleanStr = cr.getCleanHTML();
String s = StringEscapeUtils.unescapeHtml4(cleanStr);
Example 1: JSON as the request:
If I remove line number 4 (the unescapeHtml4 call), I end up with the quotes still escaped: {&quot;name&quot;: &quot;Mav&quot;}
Example 2: The escaped form of the following script: <script>alert("hi")</script>
If I keep line number 4 (the unescapeHtml4 call), I am vulnerable to XSS.
How can I resolve this issue? I want to handle both JSON and HTML request parameters. Any help would be appreciated.
Treat each parameter according to what content it is supposed to have and what you are planning to do with the data.
When a parameter contains JSON that you will decode, don't use AntiSamy on it.
When a parameter contains text that you will output as HTML, use AntiSamy on it.
When a parameter contains JSON that itself contains values you will output as HTML, first decode the parameter as JSON, and then use AntiSamy on the individual decoded values that you plan to output as HTML.
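The third case can be sketched roughly as follows. This is only an illustration, not a definitive implementation: JsonFieldSanitizer, sanitizeHtmlFields and the field names are hypothetical, the JSON is assumed to have been decoded into a Map already (by whatever JSON library you use), and the sanitizer is passed in as a function so the sketch runs without any library. In real code that function would wrap your AntiSamy call, for example value -> antiSamy.scan(value).getCleanHTML() with exception handling added.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical helper: sanitizes only the decoded JSON values that will be output as HTML.
final class JsonFieldSanitizer {

    static Map<String, String> sanitizeHtmlFields(Map<String, String> decoded,
                                                  Set<String> htmlFields,
                                                  UnaryOperator<String> htmlSanitizer) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : decoded.entrySet()) {
            String value = entry.getValue();
            // Only fields that are meant to carry HTML go through the sanitizer;
            // every other value is left untouched.
            boolean isHtml = htmlFields.contains(entry.getKey());
            result.put(entry.getKey(), isHtml ? htmlSanitizer.apply(value) : value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for demonstration only. This is NOT a real HTML sanitizer;
        // it merely marks which values were passed through.
        UnaryOperator<String> standIn = s -> "[sanitized]" + s;

        Map<String, String> decoded = new LinkedHashMap<>();
        decoded.put("name", "Mav");
        decoded.put("bio", "<b>hello</b>");

        System.out.println(sanitizeHtmlFields(decoded, Set.of("bio"), standIn));
    }
}
```

The point of the design is that the raw JSON string is never sanitized or unescaped as a whole, so its quotes are not mangled, and only the values destined for HTML output are cleaned.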
AntiSamy has a special purpose: to deal with text that you want to allow to contain HTML tags. Do not use AntiSamy on any other data. Instead, make sure HTML syntax is escaped when outputting any text received from a user that you do not explicitly want to allow to contain HTML markup.
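For text that should not contain markup, escaping at output time could look like the minimal sketch below. HtmlOutput and escapeHtml are hypothetical names, written by hand so the example is self-contained. If you already use Commons Lang or Commons Text, StringEscapeUtils.escapeHtml4 serves the same purpose (its exact set of escaped characters may differ from this sketch).

```java
// Hypothetical helper: escapes user-supplied text right before it is written into HTML.
final class HtmlOutput {

    // Replaces the characters that have special meaning in HTML text and attribute values.
    static String escapeHtml(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String userInput = "<script>alert(\"hi\")</script>";
        // The browser will display the text literally instead of running it.
        System.out.println(escapeHtml(userInput));
    }
}
```

With this approach the stored or decoded value stays unescaped, and the escaping happens once, at the point where the value is placed into the HTML response.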