I have a JIL file from AutoSys that looks like this:
/* ----------------- Box1 ----------------- */
insert_job: Box1 job_type: BOX
permission:
date_conditions: 0
alarm_if_fail: 1
/* ----------------- Job2 ----------------- */
insert_job: Job2 job_type: POJO
box_name: Box1
machine: testm
permission:
date_conditions: 0
condition: s(Job5) & s(Job6) & s(Job7)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks
send_notification: 0
notification_msg: "Job2 Status"
method_name: runJob
j2ee_parameter: String=https://sample.com
j2ee_parameter: String=32322sdd
j2ee_parameter: String=12
j2ee_parameter: int=120
j2ee_parameter: long=30000
/* ----------------- Job3 ----------------- */
insert_job: Job3 job_type: POJO
box_name: Box1
machine: testm
permission:
date_conditions: 0
condition: s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(Box13)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks
send_notification: 0
notification_msg: "Job3 Status"
method_name: runJob
j2ee_parameter: String=https://sample.com
j2ee_parameter: String=fgfgf1
j2ee_parameter: String=002
j2ee_parameter: int=120
j2ee_parameter: long=30000
/* ----------------- Box2 ----------------- */
insert_job: Box2 job_type: BOX
box_name: Box1
permission:
date_conditions: 0
condition: s(Job2)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks
I need to create an Excel file from the above file that looks like this:
Job | box_name | condition |
---|---|---|
Box1 | Box | |
Job2 | Box1 | s(Job5) & s(Job6) & s(Job7) |
Job3 | Box1 | s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(Box13) |
Box2 | Box1 | s(Job2) |
A basic/naïve approach is to split the file into blocks and process each one separately with regex:
import re
import pandas as pd
from collections import defaultdict

with open("input.jil", "r") as file:
    jil_data = file.read()

# The capturing group keeps each /* ... */ header in the result, so after
# dropping the leading text the list alternates header, body, header, body, ...
blocks = re.split(r"(/\*.*?\*/\s*)\n", jil_data, flags=re.DOTALL)[1:]

data = defaultdict(list)
for i in range(0, len(blocks), 2):
    body = blocks[i + 1]  # attribute lines of the current job
    insert_job = re.search(r"insert_job:\s+(\S+)", body)
    data["insert_job"].append(insert_job.group(1) if insert_job else None)
    box_name = re.search(r"box_name:\s+(\S+)", body)
    data["box_name"].append(box_name.group(1) if box_name else None)
    condition = re.search(r"condition:\s+(.*)", body)
    data["condition"].append(condition.group(1) if condition else None)

df = pd.DataFrame(data)
# Uncomment the next line to write a spreadsheet
# df.to_excel("output.xlsx", index=False, sheet_name="JIL")
Output:
print(df)
  insert_job box_name                                          condition
0       Box1     None                                               None
1       Job2     Box1                        s(Job5) & s(Job6) & s(Job7)
2       Job3     Box1  s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(...
3       Box2     Box1                                            s(Job2)
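If the comment headers are not always present, a line-by-line parse that starts a new record at every `insert_job:` line may be more forgiving. The following is only a sketch under a few assumptions: job names contain no spaces, comments are single-line `/* ... */` headers, and each attribute sits on its own line. `parse_jil`, `INSERT_RE` and `SAMPLE_JIL` are hypothetical names introduced for illustration, and the sample is a shortened copy of the file in the question.

```python
import re

# Shortened copy of the JIL from the question, used only for illustration.
SAMPLE_JIL = """\
/* ----------------- Box1 ----------------- */
insert_job: Box1 job_type: BOX
permission:
date_conditions: 0
alarm_if_fail: 1
/* ----------------- Job2 ----------------- */
insert_job: Job2 job_type: POJO
box_name: Box1
machine: testm
date_conditions: 0
condition: s(Job5) & s(Job6) & s(Job7)
/* ----------------- Box2 ----------------- */
insert_job: Box2 job_type: BOX
box_name: Box1
condition: s(Job2)
"""

# Matches "insert_job: <name>", optionally followed by "job_type: <type>".
INSERT_RE = re.compile(r"^insert_job:\s*(\S+)(?:\s+job_type:\s*(\S+))?")


def parse_jil(text, wanted=("box_name", "condition")):
    """Return one dict per insert_job definition (hypothetical helper)."""
    rows = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("/*"):
            continue  # skip blank lines and single-line comment headers
        match = INSERT_RE.match(line)
        if match:
            # A new job definition starts here.
            current = {"insert_job": match.group(1), "job_type": match.group(2)}
            current.update({key: None for key in wanted})
            rows.append(current)
            continue
        if current is None:
            continue  # attribute seen before any insert_job line: ignore it
        key, sep, value = line.partition(":")
        key = key.strip()
        # Exact key comparison, so "date_conditions" is never taken for "condition".
        if sep and key in wanted and current[key] is None:
            current[key] = value.strip() or None
    return rows


rows = parse_jil(SAMPLE_JIL)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed straight to `pd.DataFrame(rows)` and written with `to_excel` as in the snippet above. Note that writing `.xlsx` files from pandas needs an Excel engine such as openpyxl to be installed.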