Follow-up to "Can I use an at symbol (@) inside URLs?"
Based on the top voted answer, the @ is not a reserved character in the URL path (although it is in the host).
However, given an @ in the path, is the URL-encoded form interchangeable? In other words, is twitter.com/@user strictly equivalent to twitter.com/%40user?
In practice it seems like they're often used interchangeably, but I'm curious whether that is strictly the case (e.g. AbC@gmail.com is technically different from abc@gmail.com, but nearly everyone treats them the same).
More broadly, when do characters and their URL-encoded versions need to be treated the same, and when differently (e.g. example.com/path%2Fasdf is NOT the same as example.com/path/asdf) …
The URIs http://twitter.com/@user and http://twitter.com/%40user are not equivalent.
The URI standard is STD 66, which currently maps to RFC 3986 (which updates RFC 1738).
Section 6.2.2.2, Percent-Encoding Normalization, defines how to normalize percent-encoded URIs to compare them for equivalence (after uppercasing the hexadecimal digits A-F, as defined by Section 6.2.2.1, Case Normalization).
It says:
[…] some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3.
The linked section 2.3 lists the unreserved characters, which are:
a-z, A-Z
0-9
-
.
_
~
This section also states that, even if no normalization happens:
URIs that differ in the replacement of an unreserved character with its corresponding percent-encoded US-ASCII octet are equivalent: they identify the same resource.
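The normalization rule quoted above can be sketched in a few lines of Python. This is only an illustration, not a full RFC 3986 normalizer: it handles percent-encoding alone (no scheme or host case folding, no dot-segment removal), and the names `normalize_percent_encoding` and `equivalent` are made up for this example.

```python
import re
import string

# Unreserved characters as listed in RFC 3986, Section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

# Matches one percent-encoded octet such as %7E or %2f.
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_percent_encoding(uri):
    """Decode escapes of unreserved characters; uppercase all other escapes."""

    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            # Section 6.2.2.2: decode octets that correspond to unreserved characters.
            return char
        # Section 6.2.2.1: keep the escape, but uppercase its hex digits.
        return "%" + match.group(1).upper()

    return _PCT.sub(fix, uri)


def equivalent(a, b):
    """Compare two URIs after percent-encoding normalization only."""
    return normalize_percent_encoding(a) == normalize_percent_encoding(b)


print(equivalent("http://example.com/%7Euser", "http://example.com/~user"))  # True
print(equivalent("http://twitter.com/%40user", "http://twitter.com/@user"))  # False
```

The `~` is unreserved, so `%7E` is decoded and the first pair compares equal. The `@` is not in the unreserved set, so `%40` is left encoded and the second pair stays different.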
The @ is not part of the "unreserved" set. It’s part of the "reserved" set, for which Section 2.2 says:
URIs that differ in the replacement of a reserved character with its corresponding percent-encoded octet are not equivalent.
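The example.com/path%2Fasdf case from the question shows why this matters for reserved characters: a literal / delimits path segments, while %2F is data inside a single segment. A minimal sketch, assuming a consumer that splits the raw path on literal slashes before decoding (the helper name `path_segments` is hypothetical):

```python
from urllib.parse import unquote


def path_segments(path):
    """Split a raw path on literal "/" first, then percent-decode each segment."""
    return [unquote(segment) for segment in path.split("/") if segment]


print(path_segments("/path%2Fasdf"))  # ['path/asdf']
print(path_segments("/path/asdf"))    # ['path', 'asdf']
```

Decoding before splitting would collapse both paths into the same two segments, which is why the encoded and literal forms of a reserved character cannot be assumed interchangeable.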