python json encoder

How to alter python's json encoder to convert NaN to None


Say I have this (all this is Python 3.11 in case that matters):

import json
import math

not_a_number = float("NaN")  # this actually comes from somewhere else
json.dumps([not_a_number])

The output contains the literal NaN, which is not valid JSON. I've been trying to create a JSONEncoder subclass that would use math.isnan() to determine if the value is a NaN and output None (i.e. JSON null) instead.
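For reference, here is a minimal sketch of the two behaviours the standard library offers out of the box, as far as I can tell: with the default `allow_nan=True` the NaN is written out as-is, and with `allow_nan=False` the call raises `ValueError`. Neither replaces the value. The variable names are just for illustration.

```python
import json

nan = float("nan")

# Default behaviour (allow_nan=True): NaN is emitted even though it is not valid JSON.
default_output = json.dumps([nan])
print(default_output)

# Strict behaviour: the encoder refuses NaN instead of replacing it.
try:
    json.dumps([nan], allow_nan=False)
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

So `allow_nan=False` is useful for detecting the problem, but not for substituting a value.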

I first tried subclassing JSONEncoder and doing it in default(), which I found later isn't called for things like float. I then found a recommendation to override the encode() method instead, so I tried this:

class NanEncoder(json.JSONEncoder):
    def encode(self, obj):
        if isinstance(obj, float):
            if math.isnan(obj):
                return None
        return super(NanEncoder, self).encode(obj)

This works:

>>> json.dumps(not_a_number, cls=NanEncoder)
>>> json_string = json.dumps(not_a_number, cls=NanEncoder)
>>> print(json_string)
None

Cool, I think I've got it. BUT, this does not work:

>>> not_a_number_list = [not_a_number]
>>> print(not_a_number_list)
[nan]
>>> json_string = json.dumps(not_a_number_list, cls=NanEncoder)
>>> print(json_string)
[NaN]

So, going by the Python docs, maybe I need to call the encode method directly on an encoder instance, so I tried that:

>>> json_string = NanEncoder().encode(not_a_number_list)
>>> print(json_string)
[NaN]

Alas, no difference.
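A quick diagnostic sketch that seems to explain both failures: count how often each hook is called. `CountingEncoder` is a made-up name for this experiment, not anything from the json module.

```python
import json


class CountingEncoder(json.JSONEncoder):
    """Hypothetical diagnostic encoder that records which hooks are invoked."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.encode_calls = []
        self.default_calls = []

    def encode(self, obj):
        # Record every object that encode() is asked to handle.
        self.encode_calls.append(obj)
        return super().encode(obj)

    def default(self, obj):
        # Record every object the encoder could not serialize natively.
        self.default_calls.append(obj)
        return super().default(obj)


enc = CountingEncoder()
result = enc.encode([float("nan"), 1.5])
print(result)
print("encode() calls:", len(enc.encode_calls))
print("default() calls:", len(enc.default_calls))
```

encode() is called once, for the top-level list only, and default() is never called because floats are serialized natively. The nested NaN never passes through either method, which would explain why overriding them only helps when the NaN is the top-level value.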

So, here's my question: is it possible to create a JSONEncoder subclass that will find float values that are NaN and output None (JSON null) instead? Or am I relegated to doing a search/replace of the string NaN with null in the output JSON (which, theoretically anyway, could alter data I don't want to touch)? Fixing the input dictionary is not a great option because the dict that the values are in is quite large and its construction is not under my control (so I can't stop NaN from getting in there in the first place).


Solution

  • I don't know of an easy way to replace NaN with a custom value (other than replacing the NaNs in the original object).

    On top of that, the json module doesn't make it easy to monkeypatch its inner workings: https://github.com/python/cpython/blob/034bb70aaad5622bd53bad21595b18b8f4407984/Lib/json/encoder.py#L224

    But you can try:

    import json
    
    
    def custom_make_iterencode(
        markers, _default, _encoder, _indent, _floatstr, *args, **kwargs
    ):
        # Copy of the stdlib floatstr(), except for the NaN branch.
        def floatstr(
            o,
            allow_nan=True,
            _repr=float.__repr__,
            _inf=float("inf"),
            _neginf=float("-inf"),
        ):
            if o != o:
                # <--- Here is the actual NaN replacement!
                # "null" is the JSON spelling; it loads back as Python None.
                text = "null"
            elif o == _inf:
                text = "Infinity"
            elif o == _neginf:
                text = "-Infinity"
            else:
                return _repr(o)
    
            if not allow_nan:
                raise ValueError(
                    "Out of range float values are not JSON compliant: " + repr(o)
                )
    
            return text
    
        # Forward whatever the encoder passed in (default, indent, separators,
        # sort_keys, ...) and swap in only the custom floatstr.
        return orig_make_iterencode(
            markers, _default, _encoder, _indent, floatstr, *args, **kwargs
        )
    
    
    orig_make_iterencode = json.encoder._make_iterencode
    
    # https://github.com/python/cpython/blob/034bb70aaad5622bd53bad21595b18b8f4407984/Lib/json/encoder.py#L247
    json.encoder._make_iterencode = custom_make_iterencode
    json.encoder.c_make_encoder = None  # <-- pretend we don't have the C version of the function
    
    
    # finally you can do:
    
    print(json.dumps(float("nan")))
    print(json.dumps([float("nan")]))
    
    # don't forget to restore `json.encoder._make_iterencode` and `json.encoder.c_make_encoder` afterwards!
    

    Prints:

    null
    [null]
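Since the patch is global and has to be undone afterwards, one way to make that harder to forget is to wrap it in a context manager. This is only a sketch under the same assumptions as above: it relies on the private CPython names `json.encoder._make_iterencode` and `json.encoder.c_make_encoder`, which may change between versions. `nan_as_null` is a name I made up, and this variant writes JSON `null` for NaN and hands every other float to the stock float formatter.

```python
import contextlib
import json
import json.encoder


@contextlib.contextmanager
def nan_as_null():
    """Hypothetical helper: temporarily make json.dumps emit null for NaN."""
    orig_make_iterencode = json.encoder._make_iterencode
    orig_c_make_encoder = json.encoder.c_make_encoder

    def patched(markers, _default, _encoder, _indent, _floatstr, *args, **kwargs):
        def floatstr(o):
            if o != o:  # only NaN is unequal to itself
                return "null"
            return _floatstr(o)  # stock handling for every other float

        return orig_make_iterencode(
            markers, _default, _encoder, _indent, floatstr, *args, **kwargs
        )

    json.encoder._make_iterencode = patched
    json.encoder.c_make_encoder = None  # force the pure-Python code path
    try:
        yield
    finally:
        # Always restore the originals, even if encoding raised.
        json.encoder._make_iterencode = orig_make_iterencode
        json.encoder.c_make_encoder = orig_c_make_encoder


with nan_as_null():
    inside = json.dumps({"a": [float("nan"), 1.0]})

outside = json.dumps([float("nan")])
print(inside)
print(outside)
```

Inside the `with` block the NaN comes out as null; once the block exits, `json.dumps` is back to its normal behaviour. Keep in mind that the pure-Python encoder is slower than the C one, which may matter for a very large dict.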