[SOLVED] ASP.NET MVC XSS Input Field strip HTML/Scripts or Sanitize

I'm using the ASP.NET MVC AntiXssEncoder to prevent XSS in the INPUT fields of a registration form.

However, on the Update page the user sees the following:

Input Test &lt;b&gt;abc&lt;/b&gt;

What's the best practice for this scenario? 1. Sanitize or Remove all HTML and Script Tags

Thanks.

Solution

When you threat model or build a threat profile for your application, it becomes clear which systems (apps, web pages) you interface and communicate with. You learn where your input comes from and where your output goes. You can then decide whether

you want to sanitize the input and then store it in the database,
or store the input as submitted, even when it is malicious, and prevent it from executing when it is displayed in the browser or web application (remember that <script>script('foo')</script> sitting in the database is still malicious once it is reflected on a page).

It would not be apt to say conclusively either that you should sanitize input before storing it in the database (for example, if the user enters the string Alex <script>window.location='http://evil.com';</script>, you strip the malicious part and store only Alex), or that you should skip input sanitization and depend on output encoding alone.

But the best practice is defense in depth, that is, security implemented in multiple layers. Ideally, you should do both input sanitization and output encoding. If you skip input sanitization, a malicious script stored in the database may end up reflected in some other application that you feed your data to.
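The two layers can be sketched as a small C# console program (top-level statements, so C# 9 or later is assumed). SanitizeInput and EncodeForHtml are hypothetical helper names, and System.Net.WebUtility.HtmlEncode stands in for whichever encoder the application really uses, such as AntiXssEncoder. The regex-based stripping is deliberately naive and only illustrates the idea; a vetted HTML sanitizer library is the safer choice in real code.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Layer 1 (input): drop <script> blocks, then any remaining tags.
// Naive on purpose: a regex is not a full HTML parser.
static string SanitizeInput(string raw)
{
    string noScripts = Regex.Replace(
        raw, @"<script\b[^>]*>.*?</script\s*>", "",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);
    string noTags = Regex.Replace(noScripts, @"<[^>]+>", "");
    return noTags.Trim();
}

// Layer 2 (output): HTML-encode the stored value, exactly once,
// at the point where it is written into the page.
static string EncodeForHtml(string text) => WebUtility.HtmlEncode(text);

string submitted = "Alex <script>window.location='http://evil.com';</script>";
string saved = SanitizeInput(submitted);

Console.WriteLine(saved);                                // Alex
Console.WriteLine(EncodeForHtml("Tom & <b>Jerry</b>"));  // Tom &amp; &lt;b&gt;Jerry&lt;/b&gt;
```

Even with the input layer in place, the output layer still encodes everything, so data that reached the database by some other route is not trusted at display time.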

That said, in your case I think that instead of Input Test &lt;b&gt;abc&lt;/b&gt; you want to see Input Test <b>abc</b> in the output. When you look at the response (the HTML source) of the page, you should find Input Test &lt;b&gt;abc&lt;/b&gt;, which the browser displays as Input Test <b>abc</b>.

However, if you see Input Test &lt;b&gt;abc&lt;/b&gt; in the browser, then I think your response (use the view-source option or capture it with Fiddler) contains &amp;lt;b&amp;gt;abc&amp;lt;/b&amp;gt;. If so, you are running into double encoding (more precisely, double HTML output encoding). If you are using the Razor view engine, then unless you use Html.Raw ..., every string you output with Razor's @ syntax is HTML-encoded by default. If you are using the ASPX view engine, anything you put inside a <%: %> block is output-encoded as well. So if the value was already encoded (for example with AntiXssEncoder) before it reached the view, it gets encoded a second time.
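The double encoding can be reproduced in a short C# console sketch (top-level statements, so C# 9 or later is assumed). System.Net.WebUtility.HtmlEncode is used here as a stand-in for the encoder in the application; AntiXssEncoder may encode a somewhat different set of characters, but the effect of encoding twice is the same. HtmlDecode plays the role of the browser turning entities back into visible text, and the variable names are illustrative only.

```csharp
using System;
using System.Net;

string input = "<b>abc</b>";

// Encoded once: the page source holds entities, the browser shows the literal tags.
string once = WebUtility.HtmlEncode(input);

// Encoded twice: e.g. encoded before saving, then encoded again by the view.
string twice = WebUtility.HtmlEncode(once);

Console.WriteLine(once);   // &lt;b&gt;abc&lt;/b&gt;
Console.WriteLine(twice);  // &amp;lt;b&amp;gt;abc&amp;lt;/b&amp;gt;

// What the user would see after the browser decodes each response once.
Console.WriteLine(WebUtility.HtmlDecode(once));   // <b>abc</b>
Console.WriteLine(WebUtility.HtmlDecode(twice));  // &lt;b&gt;abc&lt;/b&gt;
```

If the page source looks like the second line, one of the two encoding steps should go; usually that means storing the raw (or sanitized) value and letting the view engine do the only HTML encoding.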

In my opinion, that answers your question about best practice. Please comment or edit the question if you want to know anything else related to it.