
Average the scores for each person using Perl


I have a file that contains data for many persons and their scores. I am trying to take the average of each person's scores.

Person 1
Scores (\
"0.06, 0.01, 0.07, 0.07, 0.75", \
"0.05, 0.08, 0.01, 0.09, 0.08", \
"0.10, 0.10, 0.11, 0.12, 0.10", \
"0.18, 0.19, 0.20, 0.20, 0.19", \
"0.31, 0.32, 0.32, 0.33, 0.32");
}
Person 2
Scores (\
"0.06, 0.01, 0.07, 0.07, 0.75", \
"0.05, 0.08, 0.01, 0.09, 0.08", \
"0.10, 0.10, 0.11, 0.12, 0.10", \
"0.18, 0.19, 0.20, 0.20, 0.19", \
"0.31, 0.32, 0.32, 0.33, 0.32");
}

Expected Output

Person 1 - (avg value)
Person 2 - (avg value)

What I tried:

open($in, "<file.txt")
    or die;

while(<$in>) {
    if (/Person/) {
        if (/Scores/../}/) {
            $_ =~ s/,//g;
            $_ =~ s/\\//g;  # removing all unwanted characters to take avg of numbers 
            $_ =~ s/"//g;
            $_ =~ s/values//g;
            $_ =~ s/\(//g;
            $_ =~ s/\)//g;
            $_ =~ s/;//g;
            $_ =~ s/}/ /g;
            @a1 = split(" ",$_);
        }
    }
}

After this point I am not able to store the values in an array for further computation.


Solution

  • The fundamental problem with your code is that you are walking through your input data one line at a time, but your code assumes that all the parts it needs to parse are present on that one line.

    For example, these two statements first check for the literal string Person on the current line and then check for the literal string Scores on the same line. That will never match, because the two strings are on different lines:

        if (/Person/) {
            if (/Scores/../}/) {
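
    To see why the inner block never runs, here is a small sketch that replays the same nested test over a few sample lines and counts how often the inner branch is reached. The helper name count_inner_hits and the shortened sample lines are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how often the inner branch of the
# nested test from the question is reached for a list of lines.
sub count_inner_hits {
    my @lines = @_;
    my $hits  = 0;
    for (@lines) {
        if (/Person/) {
            # Only evaluated for lines that contain "Person"
            if (/Scores/ .. /}/) {
                $hits++;
            }
        }
    }
    return $hits;
}

# Shortened sample in the same layout as the question's file
my @sample = (
    'Person 1',
    'Scores (\\',
    '"0.06, 0.01, 0.07, 0.07, 0.75", \\',
    '"0.31, 0.32, 0.32, 0.33, 0.32");',
    '}',
);

print "inner branch reached ", count_inner_hits(@sample), " time(s)\n";
```

    The count stays at 0 for this sample: the range test is only evaluated on the Person lines, and none of those contains Scores.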
    

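    There are lots of approaches to this problem. Here is one that reads the whole file into memory and then matches each Person/Scores section with a regular expression.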

    use strict;
    use warnings;
    
    use List::Util qw(sum);
    
    
    # read the complete file into $data
    my $data;
    {
        local $/;
        $data = <DATA>;
    }
    
    # repeatedly match each Person/Scores section
    while ($data =~ /Person\s+(\S+)\s+Scores\s+\((.+?)\)/smg)
    {
        my $person = $1;
        my $scores = $2;
    
        # now split $scores into the individual values - store in @scores
        my @scores;
        while ($scores =~ /(\d+\.\d+)/smg)
        {
            push @scores, $1
        }
    
        # @scores now holds the individual values.
        # Skip a section with no numbers to avoid dividing by zero,
        # then work out the average.
        next unless @scores;
        my $average = sum(@scores) / scalar @scores;
    
        print "Person $person - $average\n";
    }
    
    __DATA__
    Person 1
    Scores (
    "0.06, 0.01, 0.07, 0.07, 0.75", 
    "0.05, 0.08, 0.01, 0.09, 0.08", 
    "0.10, 0.10, 0.11, 0.12, 0.10", 
    "0.18, 0.19, 0.20, 0.20, 0.19", 
    "0.31, 0.32, 0.32, 0.33, 0.32");
    }
    Person 2
    Scores (
    "0.06, 0.01, 0.07, 0.07, 0.75", 
    "0.05, 0.08, 0.01, 0.09, 0.08", 
    "0.10, 0.10, 0.11, 0.12, 0.10", 
    "0.18, 0.19, 0.20, 0.20, 0.19", 
    "0.31, 0.32, 0.32, 0.33, 0.32");
    }
    

    The output is:

    Person 1 - 0.1744
    Person 2 - 0.1744
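
    A line-by-line version can also work, as long as it remembers which person the current line belongs to. The sketch below is only an illustration of that idea, not part of the solution above. The helper name average_per_person is made up, and the sample is a shortened copy of the question's data.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: takes the lines of the file and returns a list of
# [person, average] pairs in the order the persons first appear.
sub average_per_person {
    my @lines = @_;
    my (@order, %scores);
    my $current;    # person whose block we are currently inside

    for my $line (@lines) {
        if ($line =~ /Person\s+(\S+)/) {
            $current = $1;
            push @order, $current unless exists $scores{$current};
            $scores{$current} //= [];
        }
        elsif (defined $current) {
            # collect every decimal number found on this line
            push @{ $scores{$current} }, $line =~ /(\d+\.\d+)/g;
        }
    }

    # persons without any scores are skipped to avoid dividing by zero
    return map  { [ $_, sum(@{ $scores{$_} }) / scalar @{ $scores{$_} } ] }
           grep { @{ $scores{$_} } } @order;
}

# Shortened sample in the same layout as the question's file
my @sample_lines = (
    'Person 1', 'Scores (\\',
    '"0.06, 0.01, 0.07, 0.07, 0.75", \\',
    '"0.05, 0.08, 0.01, 0.09, 0.08");', '}',
    'Person 2', 'Scores (\\',
    '"0.10, 0.10, 0.11, 0.12, 0.10");', '}',
);

printf "Person %s - %s\n", @$_ for average_per_person(@sample_lines);
```

    For a real file, the lines could be passed in as average_per_person(<$in>) once $in has been opened for reading.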
    

    How the DATA filehandle works

    The DATA filehandle is automatically opened against the current script file. Its file pointer is set to the line directly after the line that begins with __DATA__. I've used that to store the data you had in file.txt.

    For example, assume the script /tmp/data.pl contains this:

    #!/usr/bin/perl
    
    use strict;
    use warnings;
    use feature 'say';
    
    while (<DATA>)
    {
        chomp;
    
        say uc $_ ;
    }
    
    __DATA__
    alpha
    beta
    gamma
    delta
    

    Running that script gives:

    $ perl /tmp/data.pl
    ALPHA
    BETA
    GAMMA
    DELTA
    

    The DATA filehandle is very convenient for creating a fully self-contained test script, because you aren't dependent on an extra file.
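
    To run the solution against the original file.txt rather than DATA, the slurp step can read from a regular filehandle instead. This is a minimal sketch: the helper name slurp_file is made up, and it assumes the input sits in file.txt in the current directory unless a path is given on the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the whole content of a file as one string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # no record separator: a single read returns the whole file
    my $content = <$fh>;
    close $fh;
    return $content;
}

# Assumption: the input is in file.txt unless a path is passed as an argument.
my $path = shift // 'file.txt';
if (-e $path) {
    my $data = slurp_file($path);
    print length($data), " characters read from $path\n";
}
```

    The string returned by slurp_file can then take the place of the $data that was read from DATA in the solution.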