python json viper

Parsing JSON failed


I am trying to parse this data (specifically, from the Viper malware analysis framework API), and I am having a hard time figuring out the best way to do it. Ideally, I would just do:

jsonObject.get("SSdeep")

... and I would get the value.

Unfortunately, I don't think this is valid JSON. Without editing the source of the project, how can I turn it into proper JSON or otherwise get at these values easily?

[{
'data': {
    'header': ['Key', 'Value'],
    'rows': [
        ['Name', u 'splwow64.exe'],
        ['Tags', ''],
        ['Path', '/home/ubuntu/viper-master/projects/../binaries/8/e/e/5/8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781'],
        ['Size', 125952],
        ['Type', 'PE32+ executable (GUI) x86-64, for MS Windows'],
        ['Mime', 'application/x-dosexec'],
        ['MD5', '4b1d2cba1367a7b99d51b1295b3a1d57'],
        ['SHA1', 'caf8382df0dcb6e9fb51a5e277685b540632bf18'],
        ['SHA256', '8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781'],
        ['SHA512', '709ca98bfc0379648bd686148853116cabc0b13d89492c8a0fa2596e50f7e4d384e5c359081a90f893d8d250cfa537193cbaa1c53186f29c0b6dedeb50d53d4d'],
        ['SSdeep', ''],
        ['CRC32', '7106095E']
    ]
},
'type': 'table'
}]

Edit 1: Thank you! So I have tried this:

        jsonObject = r.content.replace("'", "\"")
        jsonObject = jsonObject.replace(" u", "")

and the output I have now is:

"[{"data": {"header": ["Key", "Value"], "rows": [["Name","splwow64.exe"], ["Tags", ""], ["Path", "/home/ubuntu/viper-master/projects/../binaries/8/e/e/5/8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781"], ["Size", 125952], ["Type", "PE32+ executable (GUI) x86-64, for MS Windows"], ["Mime", "application/x-dosexec"], ["MD5", "4b1d2cba1367a7b99d51b1295b3a1d57"], ["SHA1", "caf8382df0dcb6e9fb51a5e277685b540632bf18"], ["SHA256", "8ee5b228bd78781aa4e6b2e15e965e24d21f791d35b1eccebd160693ba781781"], ["SHA512", "709ca98bfc0379648bd686148853116cabc0b13d89492c8a0fa2596e50f7e4d384e5c359081a90f893d8d250cfa537193cbaa1c53186f29c0b6dedeb50d53d4d"], ["SSdeep", ""], ["CRC32", "7106095E"]]}, "type": "table"}]"

and now I'm getting this error:

  File "/usr/lib/python2.7/json/decoder.py", line 369, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 5 - line 1 column 716 (char 4 - 715)

Note: I'd really rather not do find-and-replaces like that, especially the " u" one, as it could have unintended consequences.
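
A likely reason for the "Extra data" error is that the body still begins and ends with a quote character, so the decoder reads `"[{"` as one complete JSON string and then complains about everything after it (which matches the reported char 4). The sketch below reproduces that with a shortened, hypothetical stand-in for the response body, and also shows how replacing " u" can mangle an ordinary value; the `body` and `sample` strings are made up for illustration.

```python
import json

# Hypothetical, shortened stand-in for the quoted response body after the replaces.
body = '"[{"data": {"rows": [["SSdeep", ""]]}, "type": "table"}]"'

try:
    json.loads(body)
    error = None
except ValueError as exc:  # json's decode error is a ValueError
    error = str(exc)
print(error)

# Without the wrapping quotes, the same text parses as a list of dicts.
parsed = json.loads(body[1:-1])
print(parsed[0]["type"])

# The " u" replace is risky: it also eats the start of any word beginning with "u".
sample = "['Name', u'splwow64.exe'], ['Type', 'GUI utility']"
mangled = sample.replace(" u", "")
print(mangled)
```

Stripping the outer quotes only helps once the inner quoting is already valid JSON, so this does not remove the need for the replaces themselves.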

Edit 2: Figured it out! Thank you everyone!

Here's what I ended up doing, as someone mentioned the original text from the server was a "list of dicts":

        import ast
        import requests

        r = requests.post(url, data=data)  # Make the server request (url and data are defined elsewhere)
        listObject = r.content  # Grab the content (don't really need this line)
        listObject = listObject[1:-1]  # Get rid of the quotes wrapping the response body
        listObject = ast.literal_eval(listObject)  # Parse the Python-literal text into a list of dicts
        dictObject = listObject[0]  # My dict!
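
The `[1:-1]` slice assumes the body is always wrapped in quotes. A slightly more defensive variant, sketched here with a hypothetical helper name (`parse_viper_response` is not part of Viper or requests), strips the quotes only when they are actually present. It assumes the body is text, as `r.content` is in Python 2.

```python
import ast


def parse_viper_response(text):
    """Hypothetical helper: parse a Python-literal body, tolerating wrapping quotes."""
    text = text.strip()
    # Only drop the first and last character when they are a matching pair of quotes.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        text = text[1:-1]
    return ast.literal_eval(text)


quoted = "\"[{'type': 'table'}]\""   # body wrapped in double quotes
bare = "[{'type': 'table'}]"         # body without the wrapping quotes

print(parse_viper_response(quoted))
print(parse_viper_response(bare))
```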

Solution

  • JSON requires double quotes (") around strings. From the JSON standard:

    A value can be a string in double quotes, or a number, or true or false or null, or an object or an array.

    So you would need to replace all the single quotes with double quotes:

    data.replace("'", '"')
    

    There is also a spurious u in the Name field that will need to be removed.
    However, if the data is valid Python and you trust it, you could try evaluating it. This worked with your original data (without the space after the u):

    result = eval(data)
    

    Or more safely:

    result = ast.literal_eval(data)
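
To get from there to the `.get("SSdeep")` style of lookup asked for in the question, one option is to turn the rows into a dict. This is a minimal sketch using a shortened copy of the data from the question (with the u prefix written without the space); the variable names `text`, `tables`, and `info` are just for illustration.

```python
import ast

# Shortened copy of the response text from the question.
text = """[{
'data': {
    'header': ['Key', 'Value'],
    'rows': [
        ['Name', u'splwow64.exe'],
        ['Size', 125952],
        ['SSdeep', ''],
        ['CRC32', '7106095E']
    ]
},
'type': 'table'
}]"""

tables = ast.literal_eval(text)      # list of dicts, no JSON conversion needed
rows = tables[0]['data']['rows']     # each row is a [key, value] pair
info = dict(rows)                    # so the rows convert directly to a dict

print(info.get('SSdeep'))
print(info.get('CRC32'))
```

Since `literal_eval` only accepts Python literals, it refuses arbitrary expressions that `eval` would happily run, which is why it is the safer choice for text coming from a server.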