java json escaping

Unescape a JSON string so that escaping inside text values remains


There is an escaped String JSON with some escaped chars ("Bob") inside:

{\"id\":\"1\",\"name\":\"cat with \"Bob\" name\"}

It's necessary to correctly unescape it (escaping inside text should remain):

{"id":"1","name":"cat with \"Bob\" name"}

Could you advise, please, how to do this?

I've tried:

JSONObject jsonObject = new JSONObject(jsonStringEscaped); // 1
StringEscapeUtils.unescapeJava(jsonStringEscaped); // 2

but they unescape everything, so the result is invalid JSON:

{"id":"1","name":"cat with "Bob" name"}

Solution

  • There is an escaped String JSON with some escaped chars ("Bob") inside:

    Actually, that is not valid JSON that was put through a string-escaping mechanism. If it were, the backslashes in front of the quotes around "Bob" would themselves have been escaped (doubled up). This is what a properly escaped JSON string looks like:

    "{\"id\":\"1\",\"name\":\"cat with \\\"Bob\\\" name\"}"
    

    If you had that, you could just do what you did: toss it through unescapeJava, or parse it as a JSON string. Either one gives you a string that you can hand to a JSON parser, which then gives you a JSON object with 2 keys (id and name).
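To make the "doubled up" part concrete, here is a minimal sketch of a correct round trip in plain Java, with no libraries. The class and method names are made up for illustration, and it handles only backslashes and double quotes; a real escaper also has to deal with control characters and \uXXXX sequences.

```java
final class EscapeRoundTrip {

    // Escape backslashes FIRST, then quotes, so that every backslash
    // already present in the input ends up doubled.
    static String escape(String raw) {
        return raw.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Reverse of escape(): scan left to right; a backslash means
    // "take the next character literally".
    static String unescape(String escaped) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < escaped.length(); i++) {
            char c = escaped.charAt(i);
            if (c == '\\' && i + 1 < escaped.length()) {
                out.append(escaped.charAt(++i));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The valid JSON from the question: {"id":"1","name":"cat with \"Bob\" name"}
        String json = "{\"id\":\"1\",\"name\":\"cat with \\\"Bob\\\" name\"}";
        String escaped = escape(json);
        System.out.println(escaped);
        // The round trip is lossless because no backslash was left unescaped.
        System.out.println(unescape(escaped).equals(json));
    }
}
```

The order of the two replace calls is the whole point: escape the quotes first and the backslashes you just added get doubled as well, which breaks the round trip in a different way.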

    But you don't have that.

    It's necessary to correctly unescape it (escaping inside text should remain)

    Mathematically impossible. You might as well ask on Stack Overflow: "I just need to go faster than the speed of light", or "I just need to square this circle using only a straightedge and a compass". It's not that nobody has come up with an answer yet; it's that someone has proven the task is impossible, so no answer will ever exist.

    Could you advise, please, how to do this?

    Go to whoever or whatever gave you that string and tell them to fix their bugs. They shouldn't hand-roll their JSON output; they should use an actual JSON library. Writing JSON, just like writing CSV, is a task where bad programmers tend to think: "Easy, I will just write that myself", and then they mess it up, because both are more difficult than many folks think. Thus, you are in the unenviable position of telling somebody they aren't as smart as they thought they were; that's difficult, and they might get defensive. But Stack Overflow can't help you with that.

    If they insist on writing their own, they should do a much better job; give them this very string as the basis for a regression test. Make no mistake: the code that produced it is buggy, and you cannot work around that bug on your end.

    Wait, mathematically impossible?

    Yes.

    Imagine this actual JSON:

    {"id":"1", "name":"Hello there ", "Bob":"Z"}
    

    Perfectly valid JSON with 3 keys (id, name, and Bob).

    Then imagine this:

    {"id":"1", "name":"Hello there \", \"Bob\":\"Z"}
    

    This is also perfectly valid JSON; with keys id and name. The value associated with name is:

    Hello there ", "Bob":"Z
    

    Which is a pretty weird string, but, a valid string nonetheless.

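    Now take those two completely different JSON documents and run each through whatever broken escaping produced your string (it apparently puts a backslash in front of unescaped quotes but never escapes the backslashes that are already there). Both come out as exactly the same string:

    {\"id\":\"1\", \"name\":\"Hello there \", \"Bob\":\"Z\"}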

    Since two completely different JSON documents produce the same output, that output alone cannot tell you which of the two was the input.
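The collision can be reproduced in a few lines of Java. Note that brokenEscape is only my guess at what the buggy producer does (a backslash goes in front of every quote that doesn't already have one, and existing backslashes are left alone); the real code may differ, but anything that yields the string in the question has the same problem. The class name and variable names are made up for the demo.

```java
final class AmbiguityDemo {

    // Hypothetical reconstruction of the buggy producer: escapes quotes
    // that are not already preceded by a backslash, never doubles backslashes.
    static String brokenEscape(String json) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            boolean alreadyEscaped = i > 0 && json.charAt(i - 1) == '\\';
            if (c == '"' && !alreadyEscaped) {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // {"id":"1", "name":"Hello there ", "Bob":"Z"}  (3 keys)
        String threeKeys = "{\"id\":\"1\", \"name\":\"Hello there \", \"Bob\":\"Z\"}";
        // {"id":"1", "name":"Hello there \", \"Bob\":\"Z"}  (2 keys)
        String twoKeys = "{\"id\":\"1\", \"name\":\"Hello there \\\", \\\"Bob\\\":\\\"Z\"}";

        String a = brokenEscape(threeKeys);
        String b = brokenEscape(twoKeys);

        System.out.println(a);
        System.out.println(b);
        // Different inputs, same output: the mapping cannot be inverted.
        System.out.println(!threeKeys.equals(twoKeys) && a.equals(b));
    }
}
```

Two distinct inputs mapping to one output is all it takes: no unescape function, however clever, can return both of them.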

    Thus, mathematically impossible. Your only option is to fix the source of this mess.