javascript ajax character-encoding form-data xmlhttprequest-level2

Is it possible to set accept-charset for a new FormData (XHR2) object, or is there a workaround?


Here is the example code (http://jsfiddle.net/epsSZ/1/):

HTML:

<form enctype="multipart/form-data" action="/echo/html" method="post" name="fileinfo" accept-charset="windows-1251">
  <label>Label:</label>
  <input type="text" name="label" size="12" maxlength="32" value="får løbende" /><br />
  <input type="submit" value="Send standart">
</form>
<button onclick="sendForm()">Send ajax!</button>

JS:

window.sendForm = function() {
  // Build the payload from the form; FormData always encodes it as UTF-8.
  var oData = new FormData(document.forms.namedItem("fileinfo"));
  var oReq = new XMLHttpRequest();
  oReq.open("POST", "/echo/html", true);
  // The browser sets the multipart Content-Type and boundary itself.
  oReq.send(oData);
};

When I submit this the old way, via a standard form submit, the request payload looks like this:

------WebKitFormBoundary2890GbzEKCmB08rz
Content-Disposition: form-data; name="label"

f&#229;r l&#248;bende

But when I submit it via AJAX, it looks a little different:

------WebKitFormBoundaryPO2mPRFKj3zsKVM5
Content-Disposition: form-data; name="label"

får løbende

As you can see, in the former case some characters are replaced with character entities, whereas with FormData the string is sent as-is. That is of course good, because it's UTF-8, but is there any way to make it behave like a standard form submit?


Solution

  • The answer to your question is no: you cannot change it. According to the XMLHttpRequest Level 2 TR, data constructed with FormData is explicitly encoded as UTF-8, and there is no mention of a way to change that.

    The usual approach of setting a charset through the MIME type or the Content-Type header has no effect on multipart requests, because the body encoding is fixed by the algorithm quoted below.

    To quote,

    If data is a FormData:

    Let the request entity body be the result of running the multipart/form-data encoding algorithm with data as form data set and with UTF-8 as the explicit character encoding.

    Let mime type be the concatenation of "multipart/form-data;", a U+0020 SPACE character, "boundary=", and the multipart/form-data boundary string generated by the multipart/form-data encoding algorithm.
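
    To make "UTF-8 as the explicit character encoding" concrete, here is a minimal sketch that dumps the bytes a field value turns into under UTF-8. utf8Hex is a hypothetical helper written for this illustration, not something the spec or the fiddle defines.

    ```javascript
    // Hex-dump the UTF-8 bytes of a string, i.e. the encoding the spec
    // mandates for FormData field values. utf8Hex is a hypothetical helper.
    function utf8Hex(text) {
      var bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
      return Array.from(bytes, function (b) {
        return b.toString(16).padStart(2, "0");
      }).join(" ");
    }

    console.log(utf8Hex("får løbende"));
    // 66 c3 a5 72 20 6c c3 b8 62 65 6e 64 65
    ```

    Each of å and ø becomes two bytes (c3 a5 and c3 b8), which is why the AJAX payload shows the characters themselves rather than entities.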

    Hope this helps!

    Update

    If you are willing to forgo

    new FormData(document.forms.namedItem("fileinfo"));

    for

    new FormData().append("name", "value")

    there might be a workable solution. Let me know if that's what you are looking for.
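
    As a rough sketch of that manual route: the helper below appends each field by hand and passes the value through an encoder first. appendEncoded, its encode parameter, and myEncoder are hypothetical names; the target only needs an append(name, value) method, which a browser FormData provides. Note that append() returns undefined, so the FormData object has to be kept in a variable rather than chained.

    ```javascript
    // Append every field to a FormData-like target, passing each value
    // through a caller-supplied encode function first.
    function appendEncoded(target, fields, encode) {
      Object.keys(fields).forEach(function (name) {
        target.append(name, encode(fields[name]));
      });
      return target; // returned so the caller can keep or send it
    }

    // Browser usage (not executed here; myEncoder is a placeholder):
    //   var oData = appendEncoded(new FormData(), { label: "får løbende" }, myEncoder);
    //   oReq.send(oData);
    ```

    Taking the encoder as a parameter keeps the choice of escaping scheme separate from the form plumbing.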

    Another Update

    I did a bit of digging and updated the fiddle with all modes.

    Here is the story:

    1 form with accept-charset="utf8" => default behavior

    The content does not require any additional escaping/encoding, so the request fires with the text intact, as får løbende.

    2 form with accept-charset="windows-1251" => your case

    The content requires additional escaping, because å and ø cannot be represented in windows-1251. The browser replaces such characters with numeric character references before sending, i.e. the content sent is f&#229;r l&#248;bende

    3 FormData constructed with form element

    The content does not require any additional escaping/encoding, since FormData always uses UTF-8. So the request fires with the text as får løbende.

    4 FormData constructed, and then appended with escaped data

    The content is still UTF-8 encoded, but it doesn't hurt to call escape(content) before appending it to the form data. The request then fires with the text f%E5r%20l%F8bende. Still no dice, right?
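
    For reference, this is what escape() does to the sample text, next to encodeURIComponent() for contrast. Both are standard globals; the variable names are only for this illustration.

    ```javascript
    var text = "får løbende";

    // escape() is deprecated; it writes code points below 256 as %XX.
    var escaped = escape(text);

    // encodeURIComponent() percent-encodes the UTF-8 bytes instead.
    var uriEncoded = encodeURIComponent(text);

    console.log(escaped);    // f%E5r%20l%F8bende
    console.log(uriEncoded); // f%C3%A5r%20l%C3%B8bende
    ```

    The two differ because escape() works on code points while encodeURIComponent() works on UTF-8 bytes, so only the escape() output lines up with the code point numbers used in HTML entities.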

    Nope, I was wrong. Looking closer (read: staring for a few minutes) at

    f&#229;r l&#248;bende and

    f%E5r%20l%F8bende

    Then it all fell into place: %E5 (hexadecimal) = &#229; (decimal). So escape() is JavaScript's way of doing things, the %-based encoding, which is not HTML-friendly.
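
    The hexadecimal/decimal relationship can be checked mechanically. The sketch below rewrites escape()-style sequences as decimal numeric character references; percentToEntity is a hypothetical name, and it converts every escaped character, including the space.

    ```javascript
    // Rewrite %XX and %uXXXX sequences produced by escape() as decimal
    // numeric character references. percentToEntity is a hypothetical name.
    function percentToEntity(escapedText) {
      return escapedText.replace(/%u([0-9A-F]{4})|%([0-9A-F]{2})/gi, function (match, wide, narrow) {
        return "&#" + parseInt(wide || narrow, 16) + ";";
      });
    }

    console.log(parseInt("E5", 16));                   // 229
    console.log(percentToEntity("%E5"));               // &#229;
    console.log(percentToEntity("f%E5r%20l%F8bende")); // f&#229;r&#32;l&#248;bende
    ```

    The last line is not identical to the native form payload, because the space also becomes &#32;. That is one reason to escape the original text directly instead of post-processing escape() output.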

    Similarly, &#NNN; is, as we know, HTML's way of encoding. So I added another mode to the AJAX version, which I'm guessing is what you are looking for:

    5 FormData constructed, and then appended with html-escaped data

    The content is still UTF-8 encoded. It doesn't hurt to escape it as HTML entities, using this wonderful piece of code from Stack Overflow. And voilà, the request fires with the text f&#229;r l&#248;bende
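
    The linked snippet is not reproduced here, so the following is only a stand-in showing the idea, under the assumption that "HTML-escaped" means replacing every non-ASCII character with a decimal numeric character reference. toNumericEntities is a hypothetical name.

    ```javascript
    // Replace every character outside ASCII with a decimal numeric
    // character reference. Hypothetical stand-in, not the linked code.
    function toNumericEntities(text) {
      // Array.from iterates by code point, so surrogate pairs stay together.
      return Array.from(text, function (ch) {
        var cp = ch.codePointAt(0);
        return cp > 127 ? "&#" + cp + ";" : ch;
      }).join("");
    }

    console.log(toNumericEntities("får løbende")); // f&#229;r l&#248;bende

    // Browser usage (not executed here):
    //   oData.append("label", toNumericEntities(labelInput.value));
    ```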

    Hope this helps clear it up!

    Update for full windows-1251 support

    The input привет får løbende was failing in the earlier mode 5. Updated fiddle: http://jsfiddle.net/epsSZ/6/.

    It uses a combination of the solution at https://stackoverflow.com/a/2711936/1304559 and mine. The problem was that everything was being escaped, so now only characters not present in the windows-1251 charset are escaped.
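
    A minimal sketch of that selective escaping follows. It is not the fiddle's code: instead of a hand-written character table it asks TextDecoder which characters windows-1251 contains, which assumes a runtime whose TextDecoder supports that label (browsers do; Node needs a build with full ICU). escapeForWindows1251 and representable are hypothetical names.

    ```javascript
    // Build the set of characters windows-1251 can represent by decoding
    // every possible byte value. Assumes TextDecoder knows this label.
    var decoder = new TextDecoder("windows-1251");
    var allBytes = Uint8Array.from({ length: 256 }, function (_, i) { return i; });
    var representable = new Set(decoder.decode(allBytes));

    // Escape only what windows-1251 cannot encode. Hypothetical helper.
    function escapeForWindows1251(text) {
      return Array.from(text, function (ch) {
        return representable.has(ch) ? ch : "&#" + ch.codePointAt(0) + ";";
      }).join("");
    }

    console.log(escapeForWindows1251("привет får løbende"));
    // привет f&#229;r l&#248;bende
    ```

    Keep in mind that the Cyrillic characters still travel as UTF-8 bytes, since FormData always uses UTF-8; only the entity substitution mirrors what the native form does.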

    I hope this helps!