
Parsing AutoSys JIL with Perl


I have an assignment to parse AutoSys JIL files. A JIL job definition is a config file that the AutoSys scheduler reads in and runs. Imagine a file formatted like this, with thousands of job definitions like the one below stacked on top of each other in exactly the same format, each beginning with the comment header and ending with the timezone line.

/* ----------------- COME_AND_PLAY_WITH_US_DANNY ----------------- */

insert_job: COME_AND_PLAY_WITH_US_DANNY   job_type: CMD
command: /bin/bash -ls
machine: capser.com
owner: twins
permission: foo,foo
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_times: "04:00"
description: "Forever, and ever and ever"
std_in_file: "/home/room217"
std_out_file: "${CASPERSYSLOG}/room217.out"
std_err_file: "${CASPERSYSLOG}/room217.err"
alarm_if_fail: 1
profile: "/autosys_profile"
timezone: US/Eastern

This is the script. I need to extract the job, machine, and command from the job definition above. It works fine, but at the moment it just prints the results line by line to the terminal, and I redirect them to a temporary file. Eventually I want to store the information in some kind of container and send it on.

#!/foo/bar/perl5/core/5.10/exec/bin/perl
use strict;
use warnings;
use File::Basename ;

my($job, $machine, $command)  ;
my $filename = '/tmp/autosys.jil_output.padc';
open(my $fh, '<:encoding(UTF-8)', $filename)
  or die "Could not open file '$filename' $!";
my $count = 0;
while (my $line = <$fh>) {
    #chomp $line;
    if($line =~ /\/\* -{17} \w+ -{17} \*\//) {
    $count = 1; }
    elsif($line =~  /(alarm_if_fail:)/) {
    $count = 0 ; }
    elsif ($count) {
             if ($line =~ m/insert_job: (\w+).*job_type: CMD/) {
             $job = $1   ;
             }
             elsif($line =~ m/command:(.*)/) {
             $command = $1  ;
             }
             elsif($line =~ m/machine:(.*)/) {
             $machine = $1  ;

             print "$job\t $machine\t $command \n ";      
             }
        }


    #sleep 1 ;
   }

My question is this: when I place the print statement for $job, $machine and $command inside the last elsif block, it works fine. However, when I place it outside the last elsif block, as in the example below, the output is duplicated over and over, with each line appearing four or five times. I do not understand why. Why do I have to put the print statement inside the last elsif block to get one correct line of output per job?

elsif ( $line =~ m/machine:(.*)/ ) {
    $machine = $1;
}

print "$job\t $machine\t $command \n ";


Reformat of above code for readability

#!/foo/bar/perl5/core/5.10/exec/bin/perl

use strict;
use warnings;

use File::Basename;

my ( $job, $machine, $command );
my $filename = '/tmp/autosys.jil_output.padc';

open( my $fh, '<:encoding(UTF-8)', $filename )
        or die "Could not open file '$filename' $!";

my $count = 0;

while ( my $line = <$fh> ) {

    #chomp $line;
    if ( $line =~ /\/\* -{17} \w+ -{17} \*\// ) {
        $count = 1;
    }
    elsif ( $line =~ /(alarm_if_fail:)/ ) {
        $count = 0;
    }
    elsif ( $count ) {

        if ( $line =~ m/insert_job: (\w+).*job_type: CMD/ ) {
            $job = $1;
        }
        elsif ( $line =~ m/command:(.*)/ ) {
            $command = $1;
        }
        elsif ( $line =~ m/machine:(.*)/ ) {
            $machine = $1;
            print "$job\t $machine\t $command \n ";
        }
    }

    # sleep 1;
}

Solution

  • As I've said in my comment, please format your code sensibly. If you don't, people will either ignore your question or be grumpy about answering it, like me.

    Please also make sure that you have given a proper example of your problem with runnable code and data. If I execute the code that you show against your sample data then everything works fine, so what problem do you have?

    As far as I can tell, you want to display the insert_job, machine, and command fields from every JIL data block whose job_type field is CMD. Is that right?

    Here's my best guess: xxfelixxx's comment is correct, and you are simply printing all the fields that you have collected so far every time you read a line from the data file, so each job is printed once per line read rather than once per job.
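
    To make that diagnosis concrete, here is a minimal sketch (not your script) that collects the fields of one job and emits a single row only when the last line of the block is reached. It assumes, as the question states, that every block ends with the timezone line. The inline sample data, the job and host names, and the names %job and @rows are all invented for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical, abbreviated JIL: two jobs, each ending with a timezone line.
    my $jil = join "\n",
        'insert_job: JOB_A   job_type: CMD',
        'command: /bin/true -x',
        'machine: host-a',
        'owner: twins',
        'timezone: US/Eastern',
        '',
        'insert_job: JOB_B   job_type: CMD',
        'command: /bin/false',
        'machine: host-b',
        'owner: twins',
        'timezone: US/Eastern',
        '';

    my ( %job, @rows );

    # Read the string line by line through an in-memory filehandle.
    open my $fh, '<', \$jil or die "Cannot read in-memory JIL: $!";

    while ( my $line = <$fh> ) {
        chomp $line;

        if ( $line =~ /^insert_job:\s*(\w+)\s+job_type:\s*(\w+)/ ) {
            # A new block starts: forget whatever was collected before.
            %job = ( name => $1, type => $2 );
        }
        elsif ( $line =~ /^(command|machine):\s*(.*)/ ) {
            $job{$1} = $2;
        }
        elsif ( $line =~ /^timezone:/ ) {
            # Last line of the block: record exactly one row, then reset.
            push @rows, join "\t", @job{qw/ name machine command /}
                if ( $job{type} // '' ) eq 'CMD';
            %job = ();
        }
    }
    close $fh;

    print "$_\n" for @rows;
    ```

    The point is only where the output happens: once per block, triggered by a line that occurs once per block, rather than once per line read.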

    My solution is to transform each data block into a hash.

    It is dangerous to use comments to delineate the blocks, and you have given no information about the ordering of the fields, so I have to assume that the insert_job field comes first. That makes sense if the file is to be used as a list of imperatives, but the additional job_type field on the same line is weird. Is that a genuine sample of your data, or another problem with your example?
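
    If insert_job really is the first field of every block, the block boundaries can be found without looking at the comments at all. The following is a small sketch of that idea only, using made-up, abbreviated input; the job names, host names, and variable names are assumptions for illustration.

    ```perl
    use strict;
    use warnings;

    # Hypothetical input: comment headers followed by two short jobs.
    my $text = join "\n",
        '/* ----------------- JOB_A ----------------- */',
        '',
        'insert_job: JOB_A   job_type: CMD',
        'machine: host-a',
        '',
        '/* ----------------- JOB_B ----------------- */',
        '',
        'insert_job: JOB_B   job_type: BOX',
        'machine: host-b',
        '';

    # The zero-width lookahead splits *before* each line that starts with
    # insert_job, so the keyword stays at the front of its own chunk.
    my @chunks = split /^(?=insert_job)/m, $text;

    # The first chunk holds only the leading comment header. A header has
    # no colon in it, so keeping only chunks with a colon discards it.
    my @blocks = grep { /:/ } @chunks;

    printf "%d chunks, %d blocks\n", scalar @chunks, scalar @blocks;
    print "block starts with: ", ( split /\n/, $_ )[0], "\n" for @blocks;
    ```

    Note that each block still carries the comment header of the following job at its tail; that is harmless as long as the field extraction only looks for key: value pairs.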

    Here's a working solution to the problem as I imagine it.

    #!/foo/bar/perl5/core/5.10/exec/bin/perl
    
    use strict;
    use warnings 'all';
    
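    # Slurp the whole DATA section into one string
    my $data = do {
        local $/;
        <DATA>;
    };
    
    # Split before each insert_job line; grep drops the leading chunk,
    # which holds only a comment header and so contains no colon
    my @data = grep /:/, split /^(?=insert_job)/m, $data;
    
    for ( @data ) {
    
        # Collect key/value pairs: a value is either a double-quoted string
        # or a single run of non-space characters
        my %data = /(\w+) \s* : \s* (?| " ( [^"]+ ) " | (\S+) )/gx;
    
        next unless $data{job_type} eq 'CMD';
    
        print "@data{qw/ insert_job machine command /}\n";
    }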
    
    
    __DATA__
    /* ----------------- COME_AND_PLAY_WITH_US_DANNY ----------------- */
    
    insert_job: COME_AND_PLAY_WITH_US_DANNY   job_type: CMD
    command: /bin/bash -ls
    machine: capser.com
    owner: twins
    permission: foo,foo
    date_conditions: 1
    days_of_week: mo,tu,we,th,fr
    start_times: "04:00"
    description: "Forever, and ever and ever"
    std_in_file: "/home/room217"
    std_out_file: "${CASPERSYSLOG}/room217.out"
    std_err_file: "${CASPERSYSLOG}/room217.err"
    alarm_if_fail: 1
    profile: "/autosys_profile"
    timezone: US/Eastern
    
    /* ----------------- COME_AND_PLAY_WITH_US_AGAIN_DANNY ----------------- */
    
    insert_job: COME_AND_PLAY_WITH_US_AGAIN_DANNY   job_type: CMD
    command: /bin/bash -ls
    machine: capser.com
    owner: twins
    permission: foo,foo
    date_conditions: 1
    days_of_week: mo,tu,we,th,fr
    start_times: "04:00"
    description: "Forever, and ever and ever"
    std_in_file: "/home/room217"
    std_out_file: "${CASPERSYSLOG}/room217.out"
    std_err_file: "${CASPERSYSLOG}/room217.err"
    alarm_if_fail: 1
    profile: "/autosys_profile"
    timezone: US/Eastern
    
    /* ----------------- NEVER_PLAY_WITH_US_AGAIN_DANNY ----------------- */
    
    insert_job: NEVER_PLAY_WITH_US_AGAIN_DANNY   job_type: CMD
    command: /bin/bash -rm *
    machine: capser.com
    owner: twins
    permission: foo,foo
    date_conditions: 1
    days_of_week: mo,tu,we,th,fr
    start_times: "04:00"
    description: "Forever, and ever and ever"
    std_in_file: "/home/room217"
    std_out_file: "${CASPERSYSLOG}/room217.out"
    std_err_file: "${CASPERSYSLOG}/room217.err"
    alarm_if_fail: 1
    profile: "/autosys_profile"
    timezone: US/Eastern
    

    output

    COME_AND_PLAY_WITH_US_DANNY capser.com /bin/bash
    COME_AND_PLAY_WITH_US_AGAIN_DANNY capser.com /bin/bash
    NEVER_PLAY_WITH_US_AGAIN_DANNY capser.com /bin/bash
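
    Note that the output above shows /bin/bash rather than /bin/bash -ls, because an unquoted value is matched with \S+ and so stops at the first space. If the full command is needed, and the results should end up in a container rather than on the terminal, one possible variation is sketched below. It is an assumption about what you want, not part of the tested solution above: it parses each block line by line, treats the insert_job line as the only one carrying two fields (as in your sample), and collects one hash reference per job in an array. The names @jobs and %field are invented for illustration.

    ```perl
    use strict;
    use warnings;

    # Abbreviated sample block, adapted from the data above.
    my $text = join "\n",
        'insert_job: NEVER_PLAY_WITH_US_AGAIN_DANNY   job_type: CMD',
        'command: /bin/bash -rm *',
        'machine: capser.com',
        'description: "Forever, and ever and ever"',
        'timezone: US/Eastern',
        '';

    my @jobs;    # one hash reference per job definition

    for my $block ( grep { /:/ } split /^(?=insert_job)/m, $text ) {
        my %field;

        for my $line ( split /\n/, $block ) {
            if ( $line =~ /^insert_job:\s*(\S+)\s+job_type:\s*(\S+)/ ) {
                # The only line in the sample that carries two fields.
                @field{qw/ insert_job job_type /} = ( $1, $2 );
            }
            elsif ( $line =~ /^(\w+):\s*(.*?)\s*$/ ) {
                # Everything after the first colon is the value.
                my ( $key, $value ) = ( $1, $2 );
                $value =~ s/^"(.*)"$/$1/;    # drop surrounding double quotes
                $field{$key} = $value;
            }
        }

        # Keep only CMD jobs, as in the solution above.
        push @jobs, \%field if ( $field{job_type} // '' ) eq 'CMD';
    }

    print join( "\t", @{$_}{qw/ insert_job machine command /} ), "\n" for @jobs;
    ```

    With the jobs held in @jobs, the array can be handed to whatever does the sending, for example a serializer, instead of being printed.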