regex perl logfile-analysis

Regex for a file with alternating names and decimal/hex values


I have a log file, my_file.txt, which I know has the following structure:

milk marta 1231 joe 1223 ralph 7C3D
cookie
jam
milk marta 163 joe 23 ralph 7FFF

Using a regex in Perl, I want to extract all the values from lines that start with the keyword milk (which I know beforehand). Each such line contains a number of value names (marta, joe and ralph in this toy example), each followed by a decimal or hexadecimal value.

Here is my attempt so far:

my $my_file = "my_file.txt";
open my $fh, '<', $my_file or die "Cannot open $my_file: $!";

while (my $line = <$fh>)
{
    if ($line =~ m/milk/)
    {
        # here iterate with a for or while loop over the $line
        # and read word and value
    }
}

Solution

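  • Aside from the issue of determining whether a digit string is decimal or hex, something like this will extract the text that you need. Calling split with no arguments breaks each line on whitespace; the first field goes into $col1 and the remaining name/value pairs are assigned straight into the hash %item.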

    use strict;
    use warnings 'all';
    
    use Data::Dump 'dd';
    
    my @data;
    
    while ( <DATA> ) {
        my ($col1, %item) = split;
        push @data, \%item if $col1 eq 'milk';
    }
    
    dd \@data;
    
    __DATA__
    milk marta 1231 joe 1223 ralph 7C3D
    cookie
    jam
    milk marta 163 joe 23 ralph 7FFF
    

    Output:

    [
      { joe => 1223, marta => 1231, ralph => "7C3D" },
      { joe => 23, marta => 163, ralph => "7FFF" },
    ]
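
The decimal-or-hex question set aside above could be handled along the following lines. This is only a sketch under a loud assumption: a value is treated as hex if it contains a letter A-F, and as decimal otherwise. The log format itself does not say how to tell them apart, so an all-digit hex value such as 1231 would be misread as decimal. The helpers to_number and parse_milk_line are hypothetical names introduced for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper. ASSUMPTION: a value containing a letter A-F is hex,
# anything else is decimal. All-digit hex values cannot be detected this way.
sub to_number {
    my ($value) = @_;
    return $value =~ /[A-Fa-f]/ ? hex($value) : $value + 0;
}

# Hypothetical helper. Returns a hash ref of name => number for a "milk" line,
# or undef for any other line (including blank or malformed ones).
sub parse_milk_line {
    my ($line) = @_;
    my ($col1, @rest) = split ' ', $line;
    return unless defined $col1 && $col1 eq 'milk';
    return if @rest % 2;                    # need complete name/value pairs
    my %item = @rest;
    $_ = to_number($_) for values %item;    # values() aliases the hash values
    return \%item;
}

my @data;
for my $line (
    'milk marta 1231 joe 1223 ralph 7C3D',
    'cookie',
    'jam',
    'milk marta 163 joe 23 ralph 7FFF',
) {
    my $item = parse_milk_line($line) or next;
    push @data, $item;
}

for my $item (@data) {
    print join(' ', map { "$_=$item->{$_}" } sort keys %$item), "\n";
}
# prints:
# joe=1223 marta=1231 ralph=31805
# joe=23 marta=163 ralph=32767
```

If the log is known to always write certain fields in hex (for example ralph), a safer variant would be to choose the conversion by field name rather than by guessing from the digits.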