Tags: python, json, database, if-statement, simplify

Simplifying nested if statements in Python


I am working on a JSON database that stores multiple-choice questions. I defined a function that takes several optional arguments so that questions can be fetched according to your needs (difficulty, subject, keyword...), and so I feel the need to use many if-else statements.

The following (symbolic) code works fine, but it is hard to digest.

import json


def getq(mine, subject, difficulty, keyword):
    with open("questions.json", "r") as f:
        data = json.load(f)

    for id in data.keys():
        if mine == True:
            if subject == True:
                if difficulty == True:
                    if keyword == True:
                        ...
                    else:
                        ...
                else:
                    if keyword == True:
                        ...
                    else:
                        ...
            else:
                if difficulty == True:
                    if keyword == True:
                        ...
                    else:
                        ...
                else:
                    if keyword == True:
                        ...
                    else:
                        ...
        elif subject == True:
            if difficulty == True:
                if keyword == True:
                    ...
                else:
                    ...
            else:
                if keyword == True:
                    ...
                else:
                    ...
        else:
            if difficulty == True:
                if keyword == True:
                    ...
                else:
                    ...
            else:
                if keyword == True:
                    ...
                else:
                    ...

It uses the following JSON structure:

{
    "1": {
        "question": "What's the capital of Spain?",
        "subject": "",
        "date" : "",
        "timesright" : 2,
        "timeswrong" : 3,
        "difficulty": "",  # This would be a function of timesright and timeswrong
        "keywords" : "geography, general knowledge",
        "explanation": "",
        "answers": [
            {
                "id": 1,
                "answer": "Paris",
                "is_correct": false
            },
            {
                "id": 2,
                "answer": "Madrid",
                "is_correct": true
            },
            {
                "id": 3,
                "answer": "Roma",
                "is_correct": false
            },
            {
                "id": 4,
                "answer": "Moscow",
                "is_correct": false
            }
        ]
    }
}

How could I make it easier to read and/or more efficient? Maybe with numpy? I would highly appreciate other suggestions. Perhaps JSON isn't the best choice, given that I intend to edit the data via commands? I am a newbie, so I do not have a clue.


Solution

  • As suggested in the comments, the simplest approach is to load the JSON into a pandas DataFrame.

    Consider the following JSON object as a sample:

    {
        "1": {
            "question": "some question?",
            "choices": [1,2,3,4],
            "answer": 0,
            "subject": "science",
            "difficulty": "medium",
            "mine": "foobar"
        },
        "2": {
            "question": "some question?",
            "choices": [1,2,3,4],
            "answer": 0,
            "subject": "math",
            "difficulty": "medium",
            "mine": "foobar"
        },
        "3": {
            "question": "some question?",
            "choices": [1,2,3,4],
            "answer": 0,
            "subject": "math",
            "difficulty": "medium",
            "mine": "foobar"
        }
    }
    

    Assuming the JSON text has been read into a string named json_str, getq could look something like this:

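    import io
    import pandas as pd

    def getq(mine='', subject='', difficulty='', keyword=''):
        # Wrap the text in StringIO: recent pandas versions deprecate passing raw JSON strings.
        df = pd.read_json(io.StringIO(json_str), orient='index').reset_index()
        # An empty string matches every row, so filters left at their default are no-ops.
        for column, value in (('mine', mine), ('subject', subject), ('difficulty', difficulty)):
            df = df[df[column].str.contains(value, regex=False)]
        if keyword:
            # Keep only the questions that contain the keyword as a whole word.
            has_keyword = df['question'].apply(lambda q: keyword in q.split(' '))
            df = df[has_keyword.astype(bool)]
        return df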
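
If pandas feels heavy for a small file, the same idea of skipping the filters that were not supplied can be sketched in plain Python. This is only a minimal sketch, not part of the pandas approach above: `matches` and the `questions` sample are hypothetical names, the field names follow the sample JSON, and `None` is assumed to mean "do not filter on this field".

```python
# Hypothetical sample mirroring the structure of the sample JSON.
questions = {
    "1": {"question": "What is an atom?", "subject": "science",
          "difficulty": "medium", "mine": "foobar"},
    "2": {"question": "What is a prime number?", "subject": "math",
          "difficulty": "easy", "mine": "foobar"},
    "3": {"question": "What is a matrix?", "subject": "math",
          "difficulty": "medium", "mine": "other"},
}


def matches(entry, mine=None, subject=None, difficulty=None, keyword=None):
    """Return True if the entry passes every filter that was supplied."""
    checks = [
        mine is None or entry["mine"] == mine,
        subject is None or entry["subject"] == subject,
        difficulty is None or entry["difficulty"] == difficulty,
        # Case-insensitive substring match on the question text.
        keyword is None or keyword.lower() in entry["question"].lower(),
    ]
    return all(checks)


def getq(data, **filters):
    """Return the ids of all questions that satisfy the given filters."""
    return [qid for qid, entry in data.items() if matches(entry, **filters)]


math_medium = getq(questions, subject="math", difficulty="medium")
with_prime = getq(questions, keyword="prime")
everything = getq(questions)
```

Each optional argument contributes one line to `checks`, so adding a new filter does not multiply the number of branches the way nested if-else statements do.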
    
    

    If your JSON data is complex, has a varying schema, and is very large, you might be better off using MongoDB and its query engine.

    Some details of your JSON structure (a sample, perhaps) would be helpful.

    EDIT:

    Your JSON structure looks fairly straightforward, so I would recommend the pandas approach. Further, for parsing JSON objects into Python objects, I would recommend dataclasses or pydantic. These let you interact with nested data in a more Pythonic fashion. Adding an example (partial) below:

    from dataclasses import dataclass
    from typing import List
    import json
    
    
    json_str = '''{
        "1": {
            "question": "What's the capital of Spain?",
            "subject": "",
            "date" : "",
            "timesright" : 2,
            "timeswrong" : 3,
            "difficulty": "",
            "keywords" : "geography, general knowledge",
            "explanation": "",
            "answers": [
                {
                    "id": 1,
                    "answer": "Paris",
                    "is_correct": false
                },
                {
                    "id": 2,
                    "answer": "Madrid",
                    "is_correct": true
                },
                {
                    "id": 3,
                    "answer": "Roma",
                    "is_correct": false
                },
                {
                    "id": 4,
                    "answer": "Moscow",
                    "is_correct": false
                }
            ]
        }
    }'''
    
    
    @dataclass
    class Answer:
        id: int
        answer: str
        is_correct: bool
    
    
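    @dataclass
    class Question:
        question: str
        subject: str
        date: str
        timesright: int
        timeswrong: int
        difficulty: str
        keywords: str
        explanation: str
        answers: List[Answer]

        def __post_init__(self):
            # json.loads yields plain dicts, so convert them to Answer instances.
            self.answers = [a if isinstance(a, Answer) else Answer(**a) for a in self.answers]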
    
    
    for question_id, question_dict in json.loads(json_str).items():
        question = Question(**question_dict)
        if question.difficulty == "high":
            print("found a difficult one!")
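
The question notes that difficulty would be a function of timesright and timeswrong. With dataclasses, one way to express that is a computed property instead of a stored field. The following is a rough sketch under stated assumptions: the thresholds (0.5 and 0.25), the labels, and the `correct_answer` helper are hypothetical, and the `Question` class is trimmed to the fields the example needs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Answer:
    id: int
    answer: str
    is_correct: bool


@dataclass
class Question:
    question: str
    timesright: int
    timeswrong: int
    answers: List[Answer]

    @property
    def difficulty(self) -> str:
        """Derive a label from the error rate (thresholds are assumptions)."""
        attempts = self.timesright + self.timeswrong
        if attempts == 0:
            return "unknown"
        error_rate = self.timeswrong / attempts
        if error_rate >= 0.5:
            return "high"
        if error_rate >= 0.25:
            return "medium"
        return "low"

    def correct_answer(self) -> Optional[Answer]:
        """Return the first answer flagged as correct, or None."""
        return next((a for a in self.answers if a.is_correct), None)


question = Question(
    question="What's the capital of Spain?",
    timesright=2,
    timeswrong=3,
    answers=[Answer(1, "Paris", False), Answer(2, "Madrid", True)],
)
label = question.difficulty
capital = question.correct_answer().answer
```

Because the label is computed on access, it stays consistent with the counters whenever timesright or timeswrong is updated, and nothing extra has to be written back to the JSON file.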