
JIL File processing using Python


I have a JIL file from Autosys that looks like this:

/* ----------------- Box1 ----------------- */
insert_job: Box1   job_type: BOX 
permission: 
date_conditions: 0
alarm_if_fail: 1

/* ----------------- Job2 ----------------- */ 

insert_job: Job2   job_type: POJO 
box_name: Box1
machine: testm
permission: 
date_conditions: 0
condition: s(Job5) & s(Job6) & s(Job7)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks
send_notification: 0
notification_msg: "Job2 Status"
method_name: runJob
j2ee_parameter: String=https://sample.com
j2ee_parameter: String=32322sdd
j2ee_parameter: String=12
j2ee_parameter: int=120
j2ee_parameter: long=30000


/* ----------------- Job3 ----------------- */ 

insert_job: Job3   job_type: POJO 
box_name: Box1
machine: testm
permission: 
date_conditions: 0
condition: s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(Box13)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks
send_notification: 0
notification_msg: "Job3 Status"
method_name: runJob
j2ee_parameter: String=https://sample.com
j2ee_parameter: String=fgfgf1
j2ee_parameter: String=002
j2ee_parameter: int=120
j2ee_parameter: long=30000


/* ----------------- Box2 ----------------- */ 

insert_job: Box2   job_type: BOX 
box_name: Box1
permission: 
date_conditions: 0
condition: s(Job2)
alarm_if_fail: 1
alarm_if_terminated: 1
application: Databricks

I need to create an Excel file from the JIL file above, laid out like this:

Job   box_name  condition
Box1  Box
Job2  Box1      s(Job5) & s(Job6) & s(Job7)
Job3  Box1      s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(Box13)
Box2  Box1      s(Job2)

Solution

  • A basic/naïve approach is to split the file into blocks at the comment headers and process each one separately with regex:

    import re
    import pandas as pd
    from collections import defaultdict

    with open("input.jil", "r") as file:
        jil_data = file.read()

    # Split on the /* ... */ header comments. The capturing group keeps each
    # header in the result, so the list alternates header, body, header, body.
    blocks = re.split(r"(/\*.*?\*/\s*)\n", jil_data, flags=re.DOTALL)[1:]

    data = defaultdict(list)
    for i in range(0, len(blocks), 2):
        body = blocks[i+1]  # blocks[i] is the header comment itself
        insert_job = re.search(r"insert_job:\s+(\S+)", body)
        data["insert_job"].append(insert_job.group(1) if insert_job else None)
        box_name = re.search(r"box_name:\s+(\S+)", body)
        data["box_name"].append(box_name.group(1) if box_name else None)
        condition = re.search(r"condition:\s+(.*)", body)
        data["condition"].append(condition.group(1) if condition else None)

    df = pd.DataFrame(data)

    # Uncomment the next line to write the spreadsheet:
    # df.to_excel("output.xlsx", index=False, sheet_name="JIL")
    

    Output:

    print(df)
    
      insert_job box_name                                          condition
    0       Box1     None                                               None
    1       Job2     Box1                        s(Job5) & s(Job6) & s(Job7)
    2       Job3     Box1  s(Box10) & s(Box11) & s(Box12) & s(Job40) & s(...
    3       Box2     Box1                                            s(Job2)
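
The approach above relies on every job being preceded by a `/* ... */` header. If that is not guaranteed, one possible alternative is to walk the file line by line and start a new record at each `insert_job:` attribute. The sketch below is only an illustration under some assumptions: `parse_jil` and `WANTED` are hypothetical names (not from the answer above), each attribute is assumed to sit on its own line as `key: value`, and header comments are assumed to fit on a single line.

```python
# Hypothetical helper: a line-based alternative to splitting on header comments.
# Assumes one `key: value` attribute per line and single-line /* ... */ comments.
WANTED = ("box_name", "condition")


def parse_jil(text, wanted=WANTED):
    """Return one dict per job with the job name and the wanted attributes."""
    rows = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and header comments.
        if not line or line.startswith("/*"):
            continue
        # Split at the first colon only, so values such as URLs stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "insert_job":
            # `insert_job: Name   job_type: TYPE` shares one line; keep the name.
            name = value.split()[0] if value else None
            current = {"insert_job": name}
            current.update({k: None for k in wanted})
            rows.append(current)
        elif current is not None and key in wanted:
            # Empty values (e.g. `permission:`) are stored as None.
            current[key] = value or None
    return rows
```

The returned list of dicts can be passed straight to `pd.DataFrame(...)`, after which the same `to_excel` call as above applies. Because keys are compared for equality, `date_conditions` is never mistaken for `condition`.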