I'm trying to parse a JSON string that contains an escape character (of some sort, I guess):
{
"publisher": "\"O'Reilly Media, Inc.\""
}
The parser works fine if I remove the \" characters from the string. The exceptions raised by the different parsers are:
json
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 17 column 20 (char 392)
ujson
ValueError: Unexpected character in found when decoding object value
How do I make the parser handle these escaped characters?
Update: P.S. json is imported as ujson in this example.
This is what my IDE shows (screenshot not preserved).
The comma was added accidentally; there is no trailing comma at the end of the JSON, and the JSON is valid.
(Screenshot of the string definition not preserved.)
You almost certainly did not escape the backslashes properly when you defined the string in Python. If you define the string properly, the JSON parses just fine:
>>> import json
>>> json_str = r'''
... {
... "publisher": "\"O'Reilly Media, Inc.\""
... }
... ''' # raw string to prevent the \" from being interpreted by Python
>>> json.loads(json_str)
{u'publisher': u'"O\'Reilly Media, Inc."'}
Note that I used a raw string literal to define the string in Python; if I had not, the \" would be interpreted by Python and a regular " would be inserted instead. Without a raw string you have to double the backslash:
>>> print '\"'
"
>>> print '\\"'
\"
>>> print r'\"'
\"
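To make the equivalence concrete, here is a minimal sketch (written for Python 3, whereas the sessions above are Python 2.7) that spells the same JSON document once as a raw string and once with doubled backslashes. The variable names are only illustrative.

```python
import json

# Raw triple-quoted literal: Python leaves the \" sequences untouched.
raw_literal = r'''{"publisher": "\"O'Reilly Media, Inc.\""}'''

# Regular literal: each backslash meant for JSON must be doubled,
# and the apostrophe is escaped only because the literal uses single quotes.
doubled_literal = '{"publisher": "\\"O\'Reilly Media, Inc.\\""}'

# Both spellings produce the same characters, so both parse identically.
same_text = raw_literal == doubled_literal
parsed = json.loads(raw_literal)
print(same_text)
print(parsed["publisher"])
```

Either spelling works; the raw string is usually easier to read when you paste JSON into source code.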
Re-encoding the parsed Python structure back to JSON shows the backslashes reappearing, with the repr() output for the string using the same doubled backslash:
>>> json.dumps(json.loads(json_str))
'{"publisher": "\\"O\'Reilly Media, Inc.\\""}'
>>> print json.dumps(json.loads(json_str))
{"publisher": "\"O'Reilly Media, Inc.\""}
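Since json.dumps() adds the escapes for you, one way to sidestep hand-written escaping altogether is to build the Python structure and serialize it. This is a sketch under the assumption that you control how the JSON text is produced; it is written for Python 3, and the `record` name is only illustrative.

```python
import json

# The value contains literal double quotes; no JSON escaping is written by hand.
record = {"publisher": '"O\'Reilly Media, Inc."'}

# json.dumps() inserts the \" escapes required by the JSON format.
encoded = json.dumps(record)
print(encoded)

# The encoded text parses back to the original structure.
round_tripped = json.loads(encoded)
```

This does not help if the JSON arrives from elsewhere already broken, but it avoids the problem for text you generate yourself.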
If you do not escape the backslash itself, you end up with unescaped quotes:
>>> json_str_improper = '''
... {
... "publisher": "\"O'Reilly Media, Inc.\""
... }
... '''
>>> print json_str_improper
{
"publisher": ""O'Reilly Media, Inc.""
}
>>> json.loads(json_str_improper)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/Users/mj/Development/Library/buildout.python/parts/opt/lib/python2.7/json/decoder.py", line 382, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting , delimiter: line 3 column 20 (char 22)
Note that the \" sequences are now printed as a plain "; the backslash is gone!
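If you want to see where a document breaks, a small helper along these lines may help. It is a sketch for Python 3, where json.loads() raises json.JSONDecodeError (a ValueError subclass with msg, lineno, colno and pos attributes); `describe_json_error` is a hypothetical name, not part of the json module.

```python
import json

def describe_json_error(text):
    """Return None if text parses as JSON, else a short description of the failure."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # Show a few characters on either side of the reported position.
        start = max(exc.pos - 10, 0)
        context = text[start:exc.pos + 10]
        return "%s at line %d, column %d, near %r" % (
            exc.msg, exc.lineno, exc.colno, context)
    return None

# Regular literal without doubled backslashes: Python consumes the \" escapes.
improper = '{"publisher": "\"O\'Reilly Media, Inc.\""}'
# Doubled backslashes survive into the string and reach the JSON parser.
proper = '{"publisher": "\\"O\'Reilly Media, Inc.\\""}'

print(describe_json_error(improper))
print(describe_json_error(proper))
```

Printing the string itself, as in the session above, remains the quickest check: if the backslashes are missing from the printed text, the problem is in the Python literal, not in the parser.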