javascript c# asp.net vb.net webforms

Encode user input to prevent web page errors


99% of answers on this issue talk about how to sanitize input. I actually don't care in this case - the page is secured with users' logons.

All I want is to prevent page errors - I really don't care what the user types in.

The input is a multi-line text box, which in practice means that users will cut + paste from Word, or from some other editor that brings its own character encodings along.

A simple "<" or "[" in the text can blow up the string entered by the user.

I of course serialize a few values into a JSON string.

Hence, I've been doing this:

JavaScript:

            // transfer user comments box to Pinfo
            // but, user might type in crap like "<" etc. - not allowed

            var MyUpLoadInfo = document.getElementById("MyUpLoadInfo")  // client side json pinfo
            var MyNotes = document.getElementById("txtNotes")
            Pinfo = JSON.parse(MyUpLoadInfo.value)
            // convert string to base64 - too much quotes and crap!!!
            var tnote = cleanup(MyNotes.value)
            Pinfo.txtNotes = window.btoa(tnote)

So, I've been using btoa to convert the string.
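
One possible contributor, offered as a guess rather than a diagnosis: `btoa` only accepts characters in the Latin-1 range (code points 0 to 255) and throws an `InvalidCharacterError` for anything else, which includes the curly quotes and long dashes that Word tends to insert. If that exception stops the script before `Pinfo.txtNotes` is assigned, the server could end up receiving something that is not Base64 at all. Below is a minimal sketch of a UTF-8-safe variant; the helper names `toBase64Utf8` and `fromBase64Utf8` are made up for illustration and are not part of the original code.

```javascript
// Hypothetical helpers (names are illustrative, not from the original post).

// Encode any JavaScript string as Base64 of its UTF-8 bytes.
function toBase64Utf8(text) {
    const bytes = new TextEncoder().encode(text);   // string -> UTF-8 bytes
    let binary = "";
    for (const b of bytes) {
        binary += String.fromCharCode(b);           // one char per byte (0..255)
    }
    return btoa(binary);                            // input is now Latin-1 only
}

// Reverse of toBase64Utf8.
function fromBase64Utf8(b64) {
    const binary = atob(b64);
    const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
    return new TextDecoder().decode(bytes);
}

// Text with Word-style curly quotes plus an unclosed bracket.
const pasted = "It\u2019s \u201Cquoted\u201D [only this or that--";

let plainBtoaFailed = false;
try {
    btoa(pasted);               // throws: U+2019 is outside Latin-1
} catch (e) {
    plainBtoaFailed = true;
}

const encoded = toBase64Utf8(pasted);
const decoded = fromBase64Utf8(encoded);
```

Because the Base64 payload here carries UTF-8 bytes, the existing server-side decode (`UTF8Encoding.GetString` over `Convert.FromBase64String`) should be able to read it unchanged, assuming the value arrives intact.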

Server side, I then simply decode that string with this:

    Dim sConvert As New UTF8Encoding

   cPinfo2.txtNotes = 
       sConvert.GetString((Convert.FromBase64String(cPinfo2.txtNotes)))

So, this works MOST of the time, and thus I do NOT care if the user types in say a "<", or a ":" or whatever.

However, every few days I still get this error as a result of the above:

Error Description:
The input is not a valid Base-64 string as it contains a non-base 64 character, 
more than two padding characters, or an illegal character among the padding
characters. 

I'm even happy to convert to a hex string.

Is there a simple encoding trick I can use client side, and then de-code server side that will prevent such errors?

I mean, I am thinking one solution would be to drop in an HTML editor, such as ckeditor, or use the ajax HTML editor I use for editing some content on the site.

Anyway, the goal is to prevent these errors, and I don't care if a simple cut + paste (say from Word) brings in a few control characters and other junk.

I can and will consider stripping out any non printable characters, or even using + calling a sanitizer routine, but I want that text box content sent up to the server - by ANY MEANS possible!
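
As a rough sketch of the "strip non-printable characters" idea: the function below removes ASCII control characters while keeping tab, line feed and carriage return, so multi-line notes survive. The name `stripControlChars` is hypothetical, and this is not the `cleanup` routine called in the code further down, whose body is not shown in this post.

```javascript
// Hypothetical helper: remove ASCII control characters,
// but keep tab (0x09), line feed (0x0A) and carriage return (0x0D).
function stripControlChars(text) {
    return text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, "");
}

// A NUL and a BEL character are dropped; the line break and tab stay.
const cleaned = stripControlChars("line one\r\nline\u0000 two\u0007\t[ok]");
```

This only covers the ASCII control range. Whether other invisible characters from Word need the same treatment is an open question here.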

As noted, serializing the text box into a JSON string works quite well, but problems started when people typed in things like

  [only this or that--

Then the opened but never closed "[" would blow up the JSON string (and the deserialization server side).

So, client side (js)

    MyUpLoadInfo.value = JSON.stringify(Pinfo)

In fact, the real trouble was in the server-side code that unpacks the above:

    cPinfo2 = js.Deserialize(e.MyUpLoadInfo, GetType(ControlMIS.clsProjectInfo))

That would often error out when the text box contained stray JSON syntax characters such as ":" or "[".
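
For reference, `JSON.stringify` escapes quotes, backslashes and control characters inside string values, and characters such as `[`, `:` or `{` carry no special meaning inside a JSON string. The quick check below (the `ProjectID` field is a made-up placeholder, not a real member of the class used here) suggests the notes text by itself should not produce invalid JSON, so the failure may originate in some other step that this post does not identify.

```javascript
// ProjectID is a hypothetical field, used only to make the object look realistic.
const Pinfo = {
    ProjectID: 123,
    txtNotes: '[only this or that-- "quoted" < : {'
};

// Serialize as the page does, then parse it back.
const json = JSON.stringify(Pinfo);
const back = JSON.parse(json);

// The notes come back unchanged; the quotes were escaped as \" in the JSON text.
const notesSurvived = back.txtNotes === Pinfo.txtNotes;
```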

Anyway, the JSON string issue is not a huge deal, but some way of converting that user input into a hex string, or ANYTHING at all at this point, would suffice as a solution.

So, I can't seem to pin down what users are actually typing in, or whether this is just some stray control characters in their text - but as I stated, I don't really care, I simply want that text string - whatever it may be.

Any good ideas on how to encode that string into something that will make the trip back up to the server?

Perhaps even encoding the string into a hex string?
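
A hex encoding along those lines could look like the sketch below. It writes each UTF-8 byte of the text as two hex digits, so the result contains only the characters 0-9 and a-f. The function names `toHexUtf8` and `fromHexUtf8` are hypothetical, and the matching server-side decode (hex pairs back to bytes, then UTF-8 to string) is not shown.

```javascript
// Hypothetical helpers (names are illustrative only).

// Encode a string as lowercase hex of its UTF-8 bytes.
function toHexUtf8(text) {
    const bytes = new TextEncoder().encode(text);
    return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
}

// Decode a hex string produced by toHexUtf8 back into text.
function fromHexUtf8(hex) {
    const bytes = new Uint8Array(hex.length / 2);
    for (let i = 0; i < bytes.length; i++) {
        bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
    }
    return new TextDecoder().decode(bytes);
}

const notes = "A[ \u201Cfrom Word\u201D <";
const hex = toHexUtf8(notes);
const restored = fromHexUtf8(hex);
```

The trade-off is size: hex is twice as long as the UTF-8 bytes, while Base64 is roughly a third longer.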

Edit: My solution based on answer

OK, so really the issue for me is due to the JSON.stringify. I need to pack up a few values AND ALSO the text box string. So, several values in that JSON string, and ALSO the text notes are to be passed back to the server.

That simple text notes string was blowing up the JSON.stringify step, since a stray "[" or whatever would cause havoc with the JSON string.

So, proof of concept code (without the JSON string just yet) is this:

        <h3>Input box</h3>
        <asp:TextBox ID="TextBox1" runat="server" TextMode="MultiLine"
            Height="169px" Width="510px" ClientIDMode="Static"></asp:TextBox>

        <br />
        <asp:Button ID="Button1" runat="server" Text="client side Encode"
            OnClientClick="myencode();return false"
            />
        <br />
        <asp:Button ID="Button2" runat="server" Text="client side decode"
            OnClientClick="mydecode();return false"
            />
        <br />
        <br />

        <h3>Encoded text</h3>
        <asp:TextBox ID="TextBox2" runat="server" TextMode="MultiLine" ClientIDMode="Static"
            Height="169px" Width="510px"></asp:TextBox>
        <br />
        <br />

         <h3>Server side de-coded text</h3>
        <asp:TextBox ID="TextBox3" runat="server" TextMode="MultiLine" ClientIDMode="Static"
            Height="169px" Width="510px"></asp:TextBox>
        <br />
        <asp:Button ID="Button3" runat="server" Text="SERVER decode"
            OnClick="Button3_Click"
            />

The JS code:

    <script>
        function myencode() {

            var tbox1 = document.getElementById("TextBox1")  
            var tbox2 = document.getElementById("TextBox2")  
            var sText = encoder(tbox1.value)
            tbox2.value = sText
        }
        function mydecode() {

            var tbox1 = document.getElementById("TextBox1")  // input box
            var tbox2 = document.getElementById("TextBox2")  // encoded text box
            var sText = decodeURIComponent(atob(tbox2.value));
            alert(sText)
            tbox1.value = sText
        }

        function encoder(s) {
            var sHex = btoa(encodeURIComponent(unescape(s)));
            return sHex
        }
    </script>
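
One caveat about the `unescape` call, with a hedged alternative: `unescape` is a legacy function that decodes `%XX` sequences, so if the user's own text happens to contain something like `%41` or `%25`, it is altered before encoding and will not come back identical. Since `encodeURIComponent` already produces plain ASCII, `btoa` should be safe without it. The sketch below compares both variants; `encodeNotes` and `decodeNotes` are hypothetical names.

```javascript
// Hypothetical helpers: same idea as encoder()/mydecode(), minus unescape().
function encodeNotes(s) {
    return btoa(encodeURIComponent(s));     // percent-encode to ASCII, then Base64
}

function decodeNotes(b64) {
    return decodeURIComponent(atob(b64));   // reverse order to decode
}

// User text that happens to contain percent sequences and a curly quote.
const input = "50%25 off, see %41 \u2019[";

// Variant with unescape(): "%25" becomes "%" and "%41" becomes "A" before encoding.
const withUnescape = decodeURIComponent(atob(btoa(encodeURIComponent(unescape(input)))));

// Variant without unescape(): the text is left as typed.
const withoutUnescape = decodeNotes(encodeNotes(input));
```

The server-side decode shown below should work the same way for either variant, since both send Base64 of a percent-encoded string. Note that `encodeURIComponent` throws a `URIError` on a lone surrogate, which would need separate handling if such input is possible.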

And the server side de-code:

    Protected Sub Button3_Click(sender As Object, e As EventArgs)

        Dim sConvert As New UTF8Encoding
        Dim sBuf As String =
            sConvert.GetString(Convert.FromBase64String(TextBox2.Text))

        Dim strMyResult As String =
            Uri.UnescapeDataString(sBuf)

        TextBox3.Text = strMyResult

    End Sub

And the result is this:

[Screenshot: result of the encode/decode test page]

So, I will now try this proof of concept in production.

It is ONLY the one value (txtNotes) out of a JSON string that was causing issues with my code - and this would occur only rarely, but I never could track down which characters or what user input was causing the issues.

The above is only a test of encoding the text and then decoding it server side. The actual production code is of course a serialized JSON object, sent to the server via an ajax call. However, the basic concept of encoding the text box input as a string is the "gold" part of the moving parts here.


Solution

  • If this is good enough for the formation of image data URLs, it should be good enough for your needs:

     S=btoa(encodeURIComponent(unescape(S)));
    

    To get the original string back...

     S=decodeURIComponent(atob(S));
    

    Good luck with everything.