Python Raising Errors within List Comprehension (or a better alternative)


I have a nested structure, read from a JSON string, that looks similar to the following...

[
  {
    "id": 1,
    "type": "test",
    "sub_types": [
      {
        "id": "a",
        "type": "sub-test",
        "name": "test1"
      },
      {
        "id": "b",
        "name": "test2",
        "key_value_pairs": [
          {
            "key": 0,
            "value": "Zero"
          },
          {
            "key": 1,
            "value": "One"
          }
        ]
      }
    ]
  }
]

I need to extract and pivot the data, ready to be inserted into a database...

[
  (1, "b", 0, "Zero"),
  (1, "b", 1, "One")
]

I'm doing the following...

data_list = [
  (
    type['id'],
    sub_type['id'],
    key_value_pair['key'],
    key_value_pair['value']
  )
  for type in my_parsed_json_array
  if 'sub_types' in type
  for sub_type in type['sub_types']
  if 'key_value_pairs' in sub_type
  for key_value_pair in sub_type['key_value_pairs']
]

So far, so good.

What I need to do next, however, is enforce some constraints. For example...

if type['type'] == 'test': raise ValueError('[test] types can not contain key_value_pairs.')

But I can't put that into the comprehension, and I don't want to resort to loops. My best thought so far is...

def make_row(type, sub_type, key_value_pair):
    if type['type'] == 'test':
        raise ValueError('sub-types of a [test] type can not contain key_value_pairs.')
    return (
        type['id'],
        sub_type['id'],
        key_value_pair['key'],
        key_value_pair['value']
    )

data_list = [
  make_row(
    type,
    sub_type,
    key_value_pair
  )
  for type in my_parsed_json_array
  if 'sub_types' in type
  for sub_type in type['sub_types']
  if 'key_value_pairs' in sub_type
  for key_value_pair in sub_type['key_value_pairs']
]

That works, but it will make the check for each and every key_value_pair, which feels redundant. (Each set of key value pairs could have thousands of pairs, and the check only needs to be made once to know that they're all fine.)
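One way to run the check once per sub_type rather than once per pair might be to call a validating helper from the if clause at the sub_type level. The following is only a sketch under assumptions: check_sub_type and build_rows are hypothetical names, type_ replaces type to avoid shadowing the built-in, and the sample data is made up for illustration.

```python
def check_sub_type(type_, sub_type):
    """Raise if sub_type breaks a rule of its parent type; otherwise return True.

    Hypothetical helper: it always returns True on success so that it can sit
    inside a comprehension's `if` clause without filtering anything out.
    """
    if type_['type'] == 'test' and 'key_value_pairs' in sub_type:
        raise ValueError('[test] types can not contain key_value_pairs.')
    return True


def build_rows(parsed):
    """Flatten the nested structure into (type id, sub_type id, key, value) rows."""
    return [
        (type_['id'], sub_type['id'], pair['key'], pair['value'])
        for type_ in parsed
        if 'sub_types' in type_
        for sub_type in type_['sub_types']
        # The check runs once per sub_type, before the innermost loop starts.
        if check_sub_type(type_, sub_type) and 'key_value_pairs' in sub_type
        for pair in sub_type['key_value_pairs']
    ]


# Made-up sample data: same shape as above, but with a type that is allowed
# to carry key_value_pairs.
sample_ok = [
    {
        "id": 1,
        "type": "other",
        "sub_types": [
            {"id": "a", "type": "sub-test", "name": "test1"},
            {
                "id": "b",
                "name": "test2",
                "key_value_pairs": [
                    {"key": 0, "value": "Zero"},
                    {"key": 1, "value": "One"},
                ],
            },
        ],
    }
]

rows = build_rows(sample_ok)
```

Because the helper is evaluated in the sub_type-level condition, thousands of pairs under one sub_type cost a single check. Rules for other levels of the hierarchy could be attached the same way, each in the condition of the level it applies to.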

Also, there will be other rules similar to this that apply at different levels of the hierarchy, such as "test" types can only contain "sub-test" sub_types.

What are the options other than those above?


Solution

  • You should read about how to validate your JSON data and specify explicit schema constraints with JSON Schema. It lets you mark keys as required, specify default values, add type validation, etc.

    It has a Python implementation: the jsonschema package.

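    EXAMPLE:

    from jsonschema import Draft6Validator

    schema = {
        # Draft 6 meta-schema identifier, matching Draft6Validator
        "$schema": "http://json-schema.org/draft-06/schema#",

        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "email": {"type": "string"},
        },
        "required": ["email"]
    }
    # check_schema only verifies that the schema itself is well-formed
    Draft6Validator.check_schema(schema)
    # validate() raises jsonschema.ValidationError if the data breaks the schema
    # (the dict below is a made-up sample document)
    Draft6Validator(schema).validate({"name": "Ann", "email": "ann@example.com"})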
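Applied to the structure in the question, the rule that "test" types can not contain key_value_pairs could be written as a schema roughly like the one below. This is a sketch under assumptions, not a tested production schema: the schema itself, the find_violations helper, and the sample data are all hypothetical. Draft 6 has no if/then keywords, so the condition is expressed with anyOf and not.

```python
from jsonschema import Draft6Validator

# Hypothetical schema: every item must either not be of type "test",
# or have no sub_type that carries a key_value_pairs key.
schema = {
    "type": "array",
    "items": {
        "anyOf": [
            # Branch 1: the item is not a "test" type.
            {"not": {
                "properties": {"type": {"const": "test"}},
                "required": ["type"],
            }},
            # Branch 2: none of its sub_types has key_value_pairs.
            {"properties": {
                "sub_types": {
                    "items": {"not": {"required": ["key_value_pairs"]}}
                }
            }},
        ]
    },
}

# Make sure the schema itself is well-formed before using it.
Draft6Validator.check_schema(schema)
validator = Draft6Validator(schema)


def find_violations(parsed):
    """Return one message per schema violation; an empty list means valid."""
    return [
        f"{list(error.path)}: {error.message}"
        for error in validator.iter_errors(parsed)
    ]


# Made-up sample data in the shape used in the question.
pairs = [{"key": 0, "value": "Zero"}, {"key": 1, "value": "One"}]
bad = [{"id": 1, "type": "test",
        "sub_types": [{"id": "b", "name": "test2", "key_value_pairs": pairs}]}]
good = [{"id": 2, "type": "other",
         "sub_types": [{"id": "b", "name": "test2", "key_value_pairs": pairs}]}]

bad_messages = find_violations(bad)
good_messages = find_violations(good)
```

With this approach the data is validated once, up front, and the original list comprehension can stay free of any checks.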