
Can I return a single backslash in a JSON response in FastAPI?


This seems like a really simple question, but it's not clear what the correct answer is.

We understand that the backslash is a special character (the escape character) in JSON.

From our database we're being returned a field with a backslash in it. It has to be a single backslash for contractual/legal/government representation reasons. Yet it seems to be impossible to return just one single backslash. Is this a rule of JSON? It might be, but three of us have spent a day searching and can't find out what's going on here.

Here is the FastAPI app:

from fastapi import FastAPI

app = FastAPI()


@app.get("/")
async def root():
    return {
        "backslash_1": " \ ",
        "backslash_2": " \\ ",
        "backslash_3": " \\\ ",
        "backslash_4": " \\\\ ",
        "backslash_5": " \\\\\ ",
        "backslash_6": " \\\\\\ ",
    }

Here's the JSON response:

{
    "backslash_1":" \\ ",  <-- there are 2
    "backslash_2":" \\ ",
    "backslash_3":" \\\\ ",  <-- there are 4
    "backslash_4":" \\\\ ",
    "backslash_5":" \\\\\\ ",  <-- there are 6
    "backslash_6":" \\\\\\ "
}

We're not talking about Python r'' or repr() or print; we're talking about the JSON response body from an API. This question strictly relates to API JSON bodies, so these other SO questions aren't useful here:

We've tried all of these, but they're not really helpful for our API clients' users, who see the JSON property value returned as "ref\\\\official". That is an erroneous response, because the official string should be "ref\official".

Actual advice on whether or not it's possible to return a single backslash would be really helpful.


Solution

  • OK, having asked, we've now found the answer. It is this one:

    https://stackoverflow.com/a/49763737/163567

    Yes, it is impossible -- by design.

    A JSON parser is, by nature, supposed to emit only valid JSON. From RFC 8259, emphasis mine:

    7. Strings

    The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks, except for the characters that MUST be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

    Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A through F can be uppercase or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

    Alternatively, there are two-character sequence escape representations of some popular characters. So, for example, a string containing only a single reverse solidus character may be represented more compactly as "\\".

    Note the phrase "MUST be escaped" -- "MUST" is a formally-defined term-of-art; something which does not comply with a MUST requirement from the JSON specification is not allowed to call itself JSON.

    In summary: A string containing only a literal backslash in your data may be encoded in JSON as "\u005c", or "\\". It may not be encoded as "\" (including that character as an unescaped literal).
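
To make the summary concrete, here is a minimal sketch using only Python's standard json module rather than FastAPI itself. The sample value and the "ref" key are illustrative assumptions based on the "ref\official" string from the question. It suggests that the doubled backslash exists only in the serialized text, and that a conforming JSON client decodes it back to a single backslash.

```python
import json

# Hypothetical database value: exactly one backslash (ref\official).
# In Python source the backslash has to be written as "\\".
official = "ref\\official"
assert len(official) == 12  # 3 + 1 backslash + 8

# In the app above, " \ " and " \\ " are the same Python string, which is
# why the responses come out in identical pairs.
assert " \\ " == r" \ "

# Encoding: the serializer must escape the backslash, so the text sent
# over the wire contains two backslash characters.
wire = json.dumps({"ref": official})
print(wire)     # {"ref": "ref\\official"}

# Decoding: a conforming client parses that text back to one backslash.
decoded = json.loads(wire)["ref"]
print(decoded)  # ref\official
assert decoded == official

# Both escape forms named in RFC 8259 decode to the same one-character string.
assert json.loads('"\\u005c"') == "\\"
assert json.loads('"\\\\"') == "\\"

# A raw, unescaped backslash is not valid JSON and is rejected by the parser.
try:
    json.loads('" \\ "')
except json.JSONDecodeError as exc:
    print("rejected:", exc.msg)
```

If API clients report seeing "ref\\official", they are most likely looking at the raw response text instead of the parsed value; after parsing, the string holds the single backslash the contract requires.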