We want to parse a file and build a data structure from it for later use (in Python). The content of the file looks like this:
plan HELLO
feature A
measure X :
src = "Type ,N ame"
endmeasure //X
measure Y :
src = "Type ,N ame"
endmeasure //Y
feature Aa
measure AaX :
src = "Type ,N ame"
endmeasure //AaX
measure AaY :
src = "Type ,N ame"
endmeasure //AaY
feature Aab
.....
endfeature // Aab
endfeature //Aa
endfeature // A
feature B
......
endfeature //B
endplan
plan HOLA
endplan //HOLA
So the file contains one or more plans, and each plan contains one or more features. Each feature contains one or more measures, each of which holds some info (src, type, name), and a feature can also contain further features.
We need to parse the file and create a data structure that looks like this:
plan (HELLO)
├── Feature A
│   ├── Measure X
│   ├── Measure Y
│   └── Feature Aa
│       ├── Measure AaX
│       ├── Measure AaY
│       └── Feature Aab
│           └── .......
└── Feature B
    └── ........
I am trying to parse the file line by line and create a list of lists that would contain plan -> feature -> measure, feature.
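Here is a function that turns your string into a nested dictionary. It keeps a stack of the currently open blocks: a line that opens a block pushes a new dictionary, a line starting with "end" pops it, and a "key = value" line is stored in the innermost open block: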
def getplans(s):
    stack = [{}]
    for line in s.splitlines():
        if "=" in line:  # leaf
            key, value = line.split("=", 1)
            stack[-1][key.strip()] = value.strip(' "')
        elif line.strip()[:3] == "end":
            stack.pop()
        elif line.strip():
            collection, name, *_ = line.split()
            stack.append({})
            stack[-2].setdefault(collection + "s", {})[name] = stack[-1]
    return stack[0]
Here is an example call:
s = """plan HELLO
feature A
measure X :
src = "Type, Name"
endmeasure //X
measure Y :
src = "Type, Name"
endmeasure //Y
feature Aa
measure AaX :
src = "Type, Name"
endmeasure //AaX
measure AaY :
src = "Type, Name"
endmeasure //AaY
feature Aab
measure Car :
src = "Model, Make"
endmeasure //car
endfeature // Aab
endfeature //Aa
endfeature // A
feature B
measure Hotel :
src = "Stars, Reviews"
endmeasure //Hotel
endfeature //B
endplan
plan HOLA
endplan //HOLA
"""
import json
print(json.dumps(getplans(s), indent=4))
The output:
{
    "plans": {
        "HELLO": {
            "features": {
                "A": {
                    "measures": {
                        "X": {
                            "src": "Type, Name"
                        },
                        "Y": {
                            "src": "Type, Name"
                        }
                    },
                    "features": {
                        "Aa": {
                            "measures": {
                                "AaX": {
                                    "src": "Type, Name"
                                },
                                "AaY": {
                                    "src": "Type, Name"
                                }
                            },
                            "features": {
                                "Aab": {
                                    "measures": {
                                        "Car": {
                                            "src": "Model, Make"
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                "B": {
                    "measures": {
                        "Hotel": {
                            "src": "Stars, Reviews"
                        }
                    }
                }
            }
        },
        "HOLA": {}
    }
}
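Once you have the dictionary, you can walk it recursively to get at the measures. The following is only a minimal sketch, assuming the dictionary has the shape shown above (with "features" and "measures" keys); iter_measures is a hypothetical helper name, and the small dictionary is hand-written for illustration rather than produced from your file:

```python
def iter_measures(node, path=()):
    """Yield (path, measure_dict) for every measure below a plan or feature."""
    for name, measure in node.get("measures", {}).items():
        yield path + (name,), measure
    for name, feature in node.get("features", {}).items():
        # Recurse into nested features, extending the path as we go.
        yield from iter_measures(feature, path + (name,))


# Hand-written sample in the same shape as the output above (illustrative only).
plans = {
    "plans": {
        "HELLO": {
            "features": {
                "A": {
                    "measures": {"X": {"src": "Type, Name"}},
                    "features": {
                        "Aa": {"measures": {"AaX": {"src": "Type, Name"}}}
                    },
                }
            }
        },
        "HOLA": {},
    }
}

for plan_name, plan in plans["plans"].items():
    for path, measure in iter_measures(plan, (plan_name,)):
        print("/".join(path), "->", measure["src"])
```

Each yielded path starts with the plan name and ends with the measure name, so you can also collect the results into a flat list if that suits your later processing better than the nested form.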
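If your input has some other syntax -- not included in your question -- you'll probably need to tune the script further to deal with that. For example, a line consisting of a single word, such as the "....." placeholders in your sample, would raise a ValueError when it is unpacked into collection and name.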
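One possible direction for such tuning, as a rough sketch only: the variant below also remembers the type of each open block, so it can report an "end" line that does not match the block it closes, a line that is too short, or a block left open at the end of the input. The name getplans_strict and the error messages are my own choices for illustration, and the sketch assumes the same line formats as in your sample:

```python
def getplans_strict(s):
    """Like getplans, but checks that each end tag matches the open block."""
    root = {}
    stack = [(None, root)]  # (collection type, dict) for each open block
    for lineno, line in enumerate(s.splitlines(), 1):
        text = line.strip()
        if not text:
            continue
        if "=" in text:  # leaf: key = "value"
            key, value = text.split("=", 1)
            stack[-1][1][key.strip()] = value.strip(' "')
        elif text.startswith("end"):
            # "endmeasure //X" -> "measure" (any trailing // comment is ignored)
            closing = text.split("//", 1)[0].split()[0][3:]
            opened = stack[-1][0]
            if opened != closing:
                raise ValueError(
                    f"line {lineno}: 'end{closing}' closes {opened!r}"
                )
            stack.pop()
        else:
            parts = text.split()
            if len(parts) < 2:
                raise ValueError(
                    f"line {lineno}: expected '<type> <name>', got {text!r}"
                )
            collection, name = parts[0], parts[1]
            child = {}
            stack[-1][1].setdefault(collection + "s", {})[name] = child
            stack.append((collection, child))
    if len(stack) != 1:
        raise ValueError(f"unclosed block: {stack[-1][0]}")
    return root


sample = """plan P
feature F
measure M :
src = "a, b"
endmeasure //M
endfeature //F
endplan //P
"""
print(getplans_strict(sample))

# A mismatched end tag is reported instead of silently popping the stack.
try:
    getplans_strict("plan P\nfeature F\nendplan\n")
except ValueError as err:
    print(err)
```

For well-formed input this returns the same dictionary as getplans; the difference is only in how malformed input is handled.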