Tags: encode, urlencode, url-encoding, unicode-escapes, percent-encoding

Percent-encoding a non-extended ASCII character as if it were an extended character


If we percent-encode the character "€", we get %E2%82%AC as the result. OK!

My problem:

a = %61
I already know it.

Is it possible to encode "a" to something like %XX%XX or %XX%XX%XX?
If yes, will browsers and servers understand the result as the char "a"?


Solution

  • If we percent-encode the character "€", we get %E2%82%AC as the result.

    € is Unicode codepoint U+20AC EURO SIGN. The byte sequence 0xE2 0x82 0xAC is how U+20AC is encoded in UTF-8, and %E2%82%AC is the percent-encoding of those bytes.

    a = %61
    I already know it.

    For the ASCII character a (Unicode codepoint U+0061 LATIN SMALL LETTER A), that is correct. It is encoded as the single byte 0x61 in UTF-8 (and in most other charsets), and thus can be encoded as %61 in URLs.

    Is it possible to encode "a" to something like %XX%XX or %XX%XX%XX?

    Yes. Any character can be percent-encoded in a URL: encode the character in the appropriate charset, then percent-encode each resulting byte, so the number of %XX triplets equals the number of bytes that charset produces for the character. However, unreserved ASCII characters such as letters and digits do not require such encoding; just use them as-is.

    If yes, will browsers and servers understand the result as the char "a"?

    In URLs and URL-like content encodings (like application/x-www-form-urlencoded), yes.
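
The "encode in a charset, then percent-encode the bytes" rule can be tried out with a minimal Python sketch. The helper name percent_encode is hypothetical (it is not part of any library), and the UTF-16 line is only there to illustrate that the charset decides how many %XX triplets come out. It assumes the receiving side decodes as UTF-8, which is the default for Python's urllib.parse.unquote; real browsers and servers may be configured differently.

```python
from urllib.parse import quote, unquote


def percent_encode(text: str, charset: str = "utf-8") -> str:
    """Hypothetical helper: encode text in charset, then write every byte as %XX."""
    return "".join(f"%{byte:02X}" for byte in text.encode(charset))


# UTF-8: the euro sign takes three bytes, the letter a takes one.
euro_utf8 = percent_encode("€")             # "%E2%82%AC"
a_utf8 = percent_encode("a")                # "%61"

# A different charset changes the bytes, and therefore the triplets.
a_utf16 = percent_encode("a", "utf-16-be")  # "%00%61"

# Decoding: unquote() assumes UTF-8 unless told otherwise.
decoded_a = unquote(a_utf8)                             # "a"
decoded_euro = unquote(euro_utf8)                       # "€"
decoded_utf16_as_utf8 = unquote(a_utf16)                # "\x00a", not "a"
decoded_utf16 = unquote(a_utf16, encoding="utf-16-be")  # "a"

# quote() leaves unreserved ASCII characters as-is.
quoted = quote("a€")                                    # "a%E2%82%AC"
```

In this sketch, %61 round-trips to "a" under UTF-8, while the two-triplet form %00%61 only reads back as "a" when both sides agree on UTF-16, which is why the plain a (or %61) is the form to rely on.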