python json mariadb

Convert stringified JSON objects to JSON


I'm using MariaDB, and unfortunately it returns JSON objects as strings. I want to convert these stringified JSON objects back to JSON, but only when they are actually JSON objects. Fields that are just ordinary strings should be left alone, even if they happen to parse as JSON without causing an error.

My current approach is to check whether the string contains a double quote, but this seems too naive: it also converts a string that merely happens to contain a double quote but isn't intended as a JSON object. Is there a more robust way to achieve this?

import json
results = {"not_string": 1234,
           "not_json": "1234",
           "json": '[{"json": "1234"}]',
           "false_positive": '["actually a quote in brackets"]'}

# load the json fields in the results
for key, value in results.items():
    if isinstance(value, str) and '"' in value:
        try:
            results[key] = json.loads(value)
        except Exception as e:
            pass
for key, value in results.items():
    print(type(value))

Output:

<class 'int'>
<class 'str'>
<class 'list'>
<class 'list'>  <--

Expected:

<class 'int'>
<class 'str'>
<class 'list'>
<class 'str'>  <--

Basically, I don't want to rely on "asking for forgiveness": the string can be converted to JSON without raising an error, but that conversion is a false positive and shouldn't happen.


Solution

  • Sorry for the confusing question; I failed to articulate its goal. I want to avoid converting certain string values to JSON, even when the conversion would succeed.

    For example, I do not want to convert a string of digits ("1234") to JSON, but I do want to convert JSON objects, such as dictionaries or arrays of dictionaries.

    What I've come up with is this:

    Such JSON objects contain at least one double quote (I already stated this in the question).

    They must also contain an even number of double quotes.

    And they should contain at least one colon (:).

    Based on these rules, I've expanded the check that decides whether a value should be converted to JSON:

            if isinstance(value, str) and value.count('"') % 2 == 0 and value.count('"') > 0 and ":" in value:
                try:
                    value = json.loads(value)
                except ValueError:
                    # not valid JSON (json.JSONDecodeError is a ValueError): do not change the value
                    pass
    

    This avoids converting strings that are meant to remain strings, and focuses on converting more complex JSON objects, such as dictionaries.

    I am aware that this may still produce false positives, but it is good enough for my use case: I want to focus on dictionaries, and avoid converting stringified integers and arrays that do not contain dictionaries.
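
One possible way to reduce the remaining false positives is to parse first and then inspect the type of the result, rather than counting characters in the raw string. The sketch below is only an illustration of that idea, not the method used above: the helper names `contains_dict` and `maybe_decode` and the extra `colon_in_text` sample are made up for this example, and the rule "a dict, or a list holding at least one dict" is an assumption taken from the stated goal. It only looks one level deep into lists.

```python
import json


def contains_dict(parsed):
    """Hypothetical rule: True for a dict, or a list holding at least one dict."""
    if isinstance(parsed, dict):
        return True
    if isinstance(parsed, list):
        # Shallow check only: nested lists are not searched.
        return any(isinstance(item, dict) for item in parsed)
    return False


def maybe_decode(value):
    """Return the parsed JSON if it contains a dict, otherwise the original value."""
    if not isinstance(value, str):
        return value
    try:
        parsed = json.loads(value)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return value
    return parsed if contains_dict(parsed) else value


results = {
    "not_string": 1234,
    "not_json": "1234",
    "json": '[{"json": "1234"}]',
    "false_positive": '["actually a quote in brackets"]',
    # Even number of quotes plus a colon, but still just a list of strings.
    "colon_in_text": '["meeting at 12:30"]',
}

# Build a new dict instead of mutating the one being iterated over.
decoded = {key: maybe_decode(value) for key, value in results.items()}

for key, value in decoded.items():
    print(key, type(value).__name__)
```

With this sample data, only the `json` entry is converted to a list; the others keep their original type. The `colon_in_text` value would pass the quote-and-colon check, since it has two double quotes and a colon and parses without error, whereas checking the parsed type leaves it as a string.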