I have a file that contains data for many persons and their scores. I am trying to take the average of the scores for each person.
Person 1
Scores (\
"0.06, 0.01, 0.07, 0.07, 0.75", \
"0.05, 0.08, 0.01, 0.09, 0.08", \
"0.10, 0.10, 0.11, 0.12, 0.10", \
"0.18, 0.19, 0.20, 0.20, 0.19", \
"0.31, 0.32, 0.32, 0.33, 0.32");
}
Person 2
Scores (\
"0.06, 0.01, 0.07, 0.07, 0.75", \
"0.05, 0.08, 0.01, 0.09, 0.08", \
"0.10, 0.10, 0.11, 0.12, 0.10", \
"0.18, 0.19, 0.20, 0.20, 0.19", \
"0.31, 0.32, 0.32, 0.33, 0.32");
}
Expected Output
Person 1 - (avg value)
Person 2 - (avg value)
What I tried:
open($in, "<", "file.txt")
    or die "Cannot open file.txt: $!";
while (<$in>) {
    if (/Person/) {
        if (/Scores/../}/) {
            # removing all unwanted characters to take avg of numbers
            $_ =~ s/,//g;
            $_ =~ s/\\//g;
            $_ =~ s/"//g;
            $_ =~ s/values//g;
            $_ =~ s/\(//g;
            $_ =~ s/\)//g;
            $_ =~ s/;//g;
            $_ =~ s/}/ /g;
            @a1 = split(" ", $_);
        }
    }
}
After this point I am not able to store the values in an array for further computation.
The fundamental problem with your code is that you are walking through the input data one line at a time, but the code assumes that all the parts it needs to parse a record are present on that one line.
For example, these two statements first check for the literal string Person on the current line, and then check for the literal string Scores on the same line. That will never match, because the two literal strings are on different lines:
if (/Person/) {
if (/Scores/../}/) {
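To see this in isolation, here is a minimal sketch that applies the same nesting to the first two lines of a record. It is a simplification: the flip-flop is replaced by a plain match on Scores, and the names @lines and $hits are made up for the illustration.

```perl
use strict;
use warnings;

# The first two lines of a record, as they appear in the input file.
my @lines = ("Person 1\n", "Scores (\\\n");

my $hits = 0;    # hypothetical counter: lines that satisfy both tests
for my $line (@lines) {
    # Same nesting as in the question: both patterns are tested against one line.
    if ($line =~ /Person/) {
        if ($line =~ /Scores/) {
            $hits++;
        }
    }
}

print "lines matching both patterns: $hits\n";    # prints 0
```

The inner test is only reached on the Person line, and that line never contains Scores, so the body is never executed.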
There are many approaches to this problem; here is one of them.
use strict;
use warnings;
use List::Util qw(sum);

# read the complete file into $data
my $data;
{
    local $/;    # no record separator, so one read returns the whole file
    $data = <DATA>;
}

# repeatedly match each Person/Scores section
while ($data =~ /Person\s+(\S+)\s+Scores\s+\((.+?)\)/smg)
{
    my $person = $1;
    my $scores = $2;

    # now split $scores into the individual values - store in @scores
    my @scores;
    while ($scores =~ /(\d+\.\d+)/smg)
    {
        push @scores, $1;
    }

    # @scores now holds the individual values.
    # Can work out the average from them
    my $average = sum(@scores) / scalar @scores;
    print "Person $person - $average\n";
}
__DATA__
Person 1
Scores (
"0.06, 0.01, 0.07, 0.07, 0.75",
"0.05, 0.08, 0.01, 0.09, 0.08",
"0.10, 0.10, 0.11, 0.12, 0.10",
"0.18, 0.19, 0.20, 0.20, 0.19",
"0.31, 0.32, 0.32, 0.33, 0.32");
}
Person 2
Scores (
"0.06, 0.01, 0.07, 0.07, 0.75",
"0.05, 0.08, 0.01, 0.09, 0.08",
"0.10, 0.10, 0.11, 0.12, 0.10",
"0.18, 0.19, 0.20, 0.20, 0.19",
"0.31, 0.32, 0.32, 0.33, 0.32");
}
The output is:
Person 1 - 0.1744
Person 2 - 0.1744
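If you would rather keep reading one line at a time, another possible approach is to remember which person you are currently inside and collect scores until the closing brace. The following is only a sketch under some assumptions: the input is a shortened sample of your data held in a string, each record ends with a line starting with }, and the names %average and $text are made up for the illustration.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Shortened sample of the input, kept in a string so the sketch is self-contained.
my $text = <<'END';
Person 1
Scores (\
"0.06, 0.01, 0.07, 0.07, 0.75", \
"0.05, 0.08, 0.01, 0.09, 0.08");
}
Person 2
Scores (\
"0.10, 0.10, 0.11, 0.12, 0.10");
}
END

# Read the string through a filehandle, just as you would read file.txt.
open(my $in, '<', \$text) or die "Cannot open in-memory input: $!";

my ($person, @scores);
my %average;    # hypothetical result hash: person => average score

while (my $line = <$in>) {
    if ($line =~ /^Person\s+(\S+)/) {
        # A new record starts: remember whose scores follow.
        $person = $1;
        @scores = ();
    }
    elsif (defined $person and $line =~ /^\}/) {
        # The record ends: compute the average of what was collected.
        $average{$person} = sum(@scores) / scalar @scores if @scores;
        undef $person;
    }
    elsif (defined $person) {
        # Any other line inside a record: collect every decimal number on it.
        push @scores, $line =~ /(\d+\.\d+)/g;
    }
}
close $in;

printf "Person %s - %.4f\n", $_, $average{$_} for sort keys %average;
```

The state kept in $person and @scores between iterations is what the original loop was missing: each line is still examined on its own, but the code no longer expects Person and Scores to appear on the same line.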
The DATA filehandle is automatically opened against the current script file. Its file pointer is set to the line directly after the line that begins with __DATA__. I've used that to store the data you had in file.txt.
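To read from a real file instead of DATA, the same slurp-and-match idea should carry over with a lexical filehandle. This is a minimal sketch, not a definitive version: person_averages is a hypothetical helper name, and a temporary file with a shortened sample stands in for file.txt so the example runs on its own.

```perl
use strict;
use warnings;
use List::Util qw(sum);
use File::Temp qw(tempfile);

# Hypothetical helper: returns a list of [person, average] pairs read from $path.
sub person_averages {
    my ($path) = @_;

    open(my $in, '<', $path) or die "Cannot open $path: $!";
    my $data = do { local $/; <$in> };    # slurp the whole file into one string
    close $in;

    my @result;
    while ($data =~ /Person\s+(\S+)\s+Scores\s+\((.+?)\)/sg) {
        my ($person, $scores) = ($1, $2);
        my @scores = $scores =~ /(\d+\.\d+)/g;    # all decimal numbers in the section
        push @result, [$person, sum(@scores) / scalar @scores] if @scores;
    }
    return @result;
}

# Temporary stand-in for file.txt (assumption: same layout as in the question).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} <<'END';
Person 1
Scores (\
"0.10, 0.10, 0.11, 0.12, 0.10", \
"0.18, 0.19, 0.20, 0.20, 0.19");
}
END
close $fh;

printf "Person %s - %.4f\n", @$_ for person_averages($path);
```

In your own script you would call person_averages('file.txt') and drop the temporary-file part.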
For example, assume the script /tmp/data.pl contains this:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
while (<DATA>)
{
    chomp;
    say uc $_;
}
__DATA__
alpha
beta
gamma
delta
Running that script gives:
$ perl /tmp/data.pl
ALPHA
BETA
GAMMA
DELTA
The DATA filehandle is very convenient for creating a fully self-contained test script, because it means you aren't dependent on an extra file.