Seems like a really simple question and it's not clear what the correct answer is here.
We understand that the backslash is a special escape character in JSON.
From our database we're being returned a field with a backslash in it. It has to be a single backslash for contractual/legal/government representation reasons. Yet it seems to be impossible to return just one single backslash. Is this a rule of JSON? It might be, but three of us have spent a day searching and can't find out what's going on here.
Here is the FastAPI app:
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {
        "backslash_1": " \ ",
        "backslash_2": " \\ ",
        "backslash_3": " \\\ ",
        "backslash_4": " \\\\ ",
        "backslash_5": " \\\\\ ",
        "backslash_6": " \\\\\\ ",
    }
Here's the JSON response:
{
"backslash_1":" \\ ", <-- there are 2
"backslash_2":" \\ ",
"backslash_3":" \\\\ ", <-- there are 4
"backslash_4":" \\\\ ",
"backslash_5":" \\\\\\ ", <-- there are 6
"backslash_6":" \\\\\\ "
}
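A minimal sketch of why the keys pair up, using only the standard json module rather than FastAPI itself (so the spacing of the output differs slightly from the response above). Python leaves an unrecognized escape such as backslash-space in place, so " \ " and " \\ " in the app are the same one-backslash string before any JSON is produced. The names one_raw, one_escaped and wire are just for illustration:

```python
import json

one_raw = r" \ "      # raw literal: exactly one backslash
one_escaped = " \\ "  # escaped literal: also exactly one backslash
assert one_raw == one_escaped
assert one_escaped.count("\\") == 1

# Serializing escapes the backslash, so the JSON text carries two characters
# for the one backslash held in the Python string.
wire = json.dumps({"backslash_2": one_escaped})
print(wire)  # {"backslash_2": " \\ "}
```

The same reasoning applies to the other pairs: " \\\ " and " \\\\ " both hold two backslashes, which become four in the JSON text.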
We're not talking about Python r'', repr() or print; we're talking about the JSON body response from an API. This question strictly relates to API JSON bodies, so these other SO questions aren't useful here.
We've tried all these, but this is not really helpful for our API clients' users, as they're seeing the JSON property value string returned as "ref\\\\official", which is an erroneous response, because the official string should be "ref\official".
Actual advice on whether or not it's possible to return a single backslash would be really helpful.
OK, having asked, we have now found the answer. It is this one:
https://stackoverflow.com/a/49763737/163567
Yes, it is impossible -- by design.
A JSON parser is, by nature, supposed to emit only valid JSON. From RFC 8259, emphasis mine:
- Strings
The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A through F can be uppercase or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".
Alternatively, there are two-character sequence escape representations of some popular characters. So, for example, a string containing only a single reverse solidus character may be represented more compactly as "\\".
Note the phrase "MUST be escaped" -- "MUST" is a formally-defined term-of-art; something which does not comply with a MUST requirement from the JSON specification is not allowed to call itself JSON.
In summary: A string containing only a literal backslash in your data may be encoded in JSON as "\u005c" or "\\". It may not be encoded as "\" (including that character as an unescaped literal).
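To see what this means for an API client, here is a rough sketch of the round trip with Python's standard json module. It assumes the client runs the body through a conforming JSON parser; the key name "ref" and the variable names are made up for illustration, and only the value ref\official comes from the question:

```python
import json

official = r"ref\official"  # the contractual value: one backslash

# Server side: serializing escapes the backslash, so the body carries two.
body = json.dumps({"ref": official})
assert body == r'{"ref": "ref\\official"}'

# Client side: parsing undoes the escape and yields the single backslash.
decoded = json.loads(body)["ref"]
assert decoded == official
assert decoded.count("\\") == 1

# The \u005c spelling decodes to the same single character.
assert json.loads(r'{"ref": "ref\u005cofficial"}')["ref"] == official

# An unescaped backslash is rejected: "\o" is not a valid JSON escape.
try:
    json.loads(r'{"ref": "ref\official"}')
    unescaped_accepted = True
except json.JSONDecodeError:
    unescaped_accepted = False
assert not unescaped_accepted
```

So the doubled backslash exists only in the JSON text on the wire. A client that reads the raw body sees two characters, while a client that parses it gets the one-backslash string back.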