regex perl

Printing first instance of match in each line of file (Perl)


I have the following in an executable .pl file:

#!/usr/bin/env perl
$file = 'TfbG_peaks.txt';
open(INFO, $file) or die("Could not open file.");

foreach $line (<INFO>) {
        if ($line =~ m/[^_]*(?=_)/){
                #print $line; #this prints lines, which means there are matches
                print $1; #but this prints nothing
        }
}

Based on my reading of "Perl read line by line" and "What does $1 mean in Perl?", print $1; should print the first match in each line, but it doesn't. Help!


Solution

  • No: $1 does not hold the first match. It holds the text saved by the first capture group, which is created by wrapping part of the pattern in parentheses, ( ... ). For example:

    if ($line =~ m/([^_]*)(?=_)/){
       print $1;
       # This now prints the text before the first underscore.
       # If the line begins with an underscore, the pattern still matches
       # (* means 'zero or more'), but $1 is then the empty string.
       # Are you sure you don't need `+` here?
    }
    

    The pattern in your original code had no capture groups, which is why $1 was empty there (undef, to be precise). The (?=...) construct does not count: it is a look-ahead assertion, which checks what follows without capturing or consuming any text.
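
To make the difference between * and + concrete, here is a minimal sketch of the capture-group version using + instead of *. The helper name first_field and the sample lines are made up for illustration (they stand in for the contents of TfbG_peaks.txt), and the script uses strict and warnings, which the original does not.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of TfbG_peaks.txt.
my @lines = ("chr1_100_200\n", "_leading\n", "no underscore here\n");

# Hypothetical helper: return the first non-empty run of non-underscore
# characters that is followed by an underscore, or undef if there is none.
sub first_field {
    my ($line) = @_;
    # The parentheses create capture group 1; (?=_) only checks what follows.
    return $line =~ /([^_]+)(?=_)/ ? $1 : undef;
}

for my $line (@lines) {
    my $field = first_field($line);
    # Only print when the pattern actually matched.
    print "$field\n" if defined $field;
}
```

With +, a line that starts with an underscore no longer yields an empty match at position 0; the regex engine moves on and looks for a later non-empty run instead. Whether that is the behaviour you want depends on your data.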