I am working on a JSON database that stores multiple-choice questions. I defined a function that takes several optional arguments so that questions can be fetched according to your needs (difficulty, subject, keyword...), and as a result I feel the need to use many if-else statements.
The following (symbolic) code works fine, but it is hard to digest.
def getq(mine, subject, difficulty, keyword):
    with open("questions.json", "r") as f:
        data = json.load(f)
    for id in data.keys():
        if mine == True:
            if subject == True:
                if difficulty == True:
                    if keyword == True:
                        ...
                    else:
                        ...
                else:
                    if keyword == True:
                        ...
                    else:
                        ...
            else:
                if difficulty == True:
                    if keyword == True:
                        ...
                    else:
                        ...
                else:
                    if keyword == True:
                        ...
                    else:
                        ...
        elif subject == True:
            if difficulty == True:
                if keyword == True:
                    ...
                else:
                    ...
            else:
                if keyword == True:
                    ...
                else:
                    ...
        else:
            if difficulty == True:
                if keyword == True:
                    ...
                else:
                    ...
            else:
                if keyword == True:
                    ...
                else:
                    ...
It uses the following JSON structure:
{
    "1": {
        "question": "What's the capital of Spain?",
        "subject": "",
        "date": "",
        "timesright": 2,
        "timeswrong": 3,
        "difficulty": "", # This would be a function of timesright and timeswrong
        "keywords": "geography, general knowledge",
        "explanation": "",
        "answers": [
            {
                "id": 1,
                "answer": "Paris",
                "is_correct": false
            },
            {
                "id": 2,
                "answer": "Madrid",
                "is_correct": true
            },
            {
                "id": 3,
                "answer": "Roma",
                "is_correct": false
            },
            {
                "id": 4,
                "answer": "Moscow",
                "is_correct": false
            }
        ]
    }
}
How could I make it easier to look at and/or more efficient? Maybe numpy? I would highly appreciate other suggestions. Perhaps using JSON isn't the best idea, given that I intend to edit it via commands? I am a newbie, so I do not have a clue.
As suggested in the comments, the simplest approach is to load the JSON into a pandas DataFrame.
Consider the following json object as a sample:
{
    "1": {
        "question": "some question?",
        "choices": [1,2,3,4],
        "answer": 0,
        "subject": "science",
        "difficulty": "medium",
        "mine": "foobar"
    },
    "2": {
        "question": "some question?",
        "choices": [1,2,3,4],
        "answer": 0,
        "subject": "math",
        "difficulty": "medium",
        "mine": "foobar"
    },
    "3": {
        "question": "some question?",
        "choices": [1,2,3,4],
        "answer": 0,
        "subject": "math",
        "difficulty": "medium",
        "mine": "foobar"
    }
}
Assuming this is read into json_str, the function getq will look something like:
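import io
import pandas as pd

def getq(mine='', subject='', difficulty='', keyword=''):
    # Wrap the string in StringIO; newer pandas versions deprecate passing a literal JSON string.
    df = pd.read_json(io.StringIO(json_str), orient='index').reset_index()
    # An empty string is contained in every value, so omitted filters keep all rows.
    df = df[df['mine'].str.contains(mine, regex=False)]
    df = df[df['subject'].str.contains(subject, regex=False)]
    df = df[df['difficulty'].str.contains(difficulty, regex=False)]
    if keyword:  # skip the keyword filter when no keyword was given
        df = df[df['question'].apply(lambda q: keyword in q.split(' '))]
    return df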
If your JSON data is complex, has a varying schema, and is very large, you might be better off using MongoDB and its query engine.
Some details of your JSON structure (a sample, perhaps) would be helpful.
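To give a feel for the MongoDB route, here is a minimal sketch of how the optional arguments could be turned into a filter document. The function name build_filter and the field names are assumptions for illustration, and it assumes the keywords are stored as an array of strings rather than one comma-separated string.

```python
def build_filter(subject=None, difficulty=None, keyword=None):
    """Build a MongoDB-style filter document from the criteria that were given."""
    query = {}
    if subject is not None:
        query["subject"] = subject
    if difficulty is not None:
        query["difficulty"] = difficulty
    if keyword is not None:
        # Assumption: "keywords" is an array of strings. An equality condition
        # on an array field matches documents whose array contains the value.
        query["keywords"] = keyword
    return query


query = build_filter(subject="math", keyword="algebra")
# With pymongo, this dict would be passed to find() on a collection,
# e.g. db.questions.find(query), where "db" and "questions" are hypothetical names.
print(query)
```

Only the criteria that were actually supplied end up in the filter, so there is no branching on every combination of arguments.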
EDIT:
Your JSON structure looks fairly straightforward, so I would recommend the pandas approach.
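As a rough sketch of how that could look against your structure: each optional argument defaults to None and only narrows a boolean mask when it is supplied. The function name get_questions and the sample values below are made up for illustration, and the sketch assumes "keywords" stays a comma-separated string as in your sample.

```python
import pandas as pd

# Hypothetical sample data following the structure from the question
# (only the fields used for filtering are included).
sample = {
    "1": {"question": "What's the capital of Spain?", "subject": "geography",
          "difficulty": "easy", "keywords": "geography, general knowledge"},
    "2": {"question": "What is 7 * 8?", "subject": "math",
          "difficulty": "medium", "keywords": "arithmetic, multiplication"},
    "3": {"question": "What is the derivative of x**2?", "subject": "math",
          "difficulty": "hard", "keywords": "calculus, derivatives"},
}


def get_questions(data, subject=None, difficulty=None, keyword=None):
    """Return the rows matching every filter that was actually supplied."""
    df = pd.DataFrame.from_dict(data, orient="index")
    mask = pd.Series(True, index=df.index)  # start by keeping every row
    if subject is not None:
        mask &= df["subject"] == subject
    if difficulty is not None:
        mask &= df["difficulty"] == difficulty
    if keyword is not None:
        # Split the comma-separated string and compare whole keywords.
        mask &= df["keywords"].apply(
            lambda ks: keyword in [k.strip() for k in ks.split(",")]
        )
    return df[mask]


math_ids = list(get_questions(sample, subject="math").index)
calculus_ids = list(get_questions(sample, subject="math", keyword="calculus").index)
print(math_ids, calculus_ids)
```

Each filter is a single flat if, so adding another criterion means adding one more block instead of doubling the number of branches.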
Further, for parsing JSON objects into Python objects, I would recommend using dataclasses or pydantic. These allow you to interact with nested data in a more Pythonic fashion. A (partial) example follows:
from dataclasses import dataclass
from typing import List
import json
json_str = '''{
    "1": {
        "question": "What's the capital of Spain?",
        "subject": "",
        "date": "",
        "timesright": 2,
        "timeswrong": 3,
        "difficulty": "",
        "keywords": "geography, general knowledge",
        "explanation": "",
        "answers": [
            {
                "id": 1,
                "answer": "Paris",
                "is_correct": false
            },
            {
                "id": 2,
                "answer": "Madrid",
                "is_correct": true
            },
            {
                "id": 3,
                "answer": "Roma",
                "is_correct": false
            },
            {
                "id": 4,
                "answer": "Moscow",
                "is_correct": false
            }
        ]
    }
}'''
@dataclass
class Answer:
    id: int
    answer: str
    is_correct: bool

@dataclass
class Question:
    question: str
    subject: str
    date: str
    timesright: int
    timeswrong: int
    difficulty: str
    keywords: str
    explanation: str
    answers: List[Answer]

    def __post_init__(self):
        # json.loads yields plain dicts for the nested answers; convert them to Answer objects.
        self.answers = [a if isinstance(a, Answer) else Answer(**a) for a in self.answers]

for question_id, question_dict in json.loads(json_str).items():
    question = Question(**question_dict)
    if question.difficulty == "high":
        print("found a difficult one!")
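If you would rather not pull in pandas, the same "only apply the filters that were given" idea works on dataclass instances too. The sketch below uses a trimmed-down, hypothetical Question class and a helper called matches (both are assumptions for illustration, not part of the code above).

```python
from dataclasses import dataclass


@dataclass
class Question:
    # Trimmed-down, hypothetical version with only the fields used for filtering.
    question: str
    subject: str
    difficulty: str
    keywords: str  # comma-separated, as in the JSON


def matches(q, subject=None, difficulty=None, keyword=None):
    """Return True if q satisfies every criterion that is not None."""
    checks = [
        subject is None or q.subject == subject,
        difficulty is None or q.difficulty == difficulty,
        keyword is None or keyword in [k.strip() for k in q.keywords.split(",")],
    ]
    return all(checks)


questions = [
    Question("What's the capital of Spain?", "geography", "easy",
             "geography, general knowledge"),
    Question("What is 7 * 8?", "math", "medium", "arithmetic"),
]

easy_geo = [q for q in questions if matches(q, subject="geography", difficulty="easy")]
everything = [q for q in questions if matches(q)]
print(len(easy_geo), len(everything))
```

A criterion left at None is treated as "don't care", which replaces the whole tree of nested if-else branches with one line per filter.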