regex linux bash perl ksh

Perl one-liner to match an exact word in a path, for many different values with special characters


How do I match the exact value of $TARGET_NAME in the output of find /tmp -type l -exec ls -l?

 $ find /tmp -type l -exec ls -l 2>/dev/null {} +
 lrwxrwxrwx 1 root root  24 Mar 18 12:41 /tmp/test/link -> /usr/admin/Collect_tests
 lrwxrwxrwx 1 root root  43 Mar 18 12:41 /tmp/test/link1 -> /usr/admin/Collect_tests/Upload.CM@.www.com
 lrwxrwxrwx 1 root root  68 Mar 18 12:41 /tmp/test/link2 -> /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com
 lrwxrwxrwx 1 root root 100 Mar 18 12:42 /tmp/test/link3 -> /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com/List.files.emails.dummy*Printed
 lrwxrwxrwx 1 root root  92 Mar 18 12:42 /tmp/test/link4 -> /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com/List.files@emails.dummy

Example values of TARGET_NAME:

 TARGET_NAME=Upload.CM@.www.com
 TARGET_NAME=Upload_Shema@@@.DATA.com
 TARGET_NAME=List.files.emails.dummy*Printed

Goal: print the link name and the path (the last field) only if $TARGET_NAME exactly matches a whole word in the last field.

For example, with TARGET_NAME=Upload_Shema@@@.DATA.com, the output should be:

/tmp/test/link2 /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com
/tmp/test/link3 /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com/List.files.emails.dummy*Printed
/tmp/test/link4 /usr/admin/Collect_tests/Upload.CM@.www.com/Upload_Shema@@@.DATA.com/List.files@emails.dummy

There are a few conditions:

1) Only the last field of the ls -l output should be matched.

Example:

      /usr/admin/Collect_tests/Upload.CM@.www.com

2) The value of $TARGET_NAME must match a whole word.

Example of a full match (with TARGET_NAME=Upload.CM@.www.com):

    /usr/admin/Collect_tests/Upload.CM@.www.com

Example of a non-full match:

    /usr/admin/Collect_tests/Upload.CM@.www.c

3) A slash ("/") must appear immediately to the left of $TARGET_NAME, and either a slash or the end of the string must appear immediately to its right.

4) Special characters such as "/", "@", ".", "*", etc. need to be escaped.

5) The code will be part of a ksh script (and could be implemented as a Perl one-liner, AWK, ksh, etc.).

Example

   find /tmp -type l -exec ls -l 2>/dev/null {} + | < Perl one liner .............. >    

Solution

  • As mentioned in response to your last question (since deleted), parsing ls output is fragile: file names may contain spaces, newlines, or even " -> ". readlink can be used instead.

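    find /tmp -type l -exec \
       perl -e'
          my $TARGET_NAME = shift;        # first argument: the name to look for
          for (@ARGV) {                   # remaining arguments: symlink paths
             my $p = readlink($_);        # link target, no ls parsing needed
             # \Q...\E quotes regex metacharacters such as "." and "*"
             $p =~ m{(?:^|/)\Q$TARGET_NAME\E(?:/|\z)}
                or next;
             print("$_\t$p\n");
          }
       ' "$TARGET_NAME" {} \;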
    

    Or, more efficiently (a single process; requires the CPAN module File::Find::Rule):

    perl -MFile::Find::Rule -e'
       my ($TARGET_NAME, $BASE) = @ARGV;
       # Walk $BASE in-process and visit every symlink found
       for (File::Find::Rule->symlink->in($BASE)) {
          my $p = readlink($_);
          # Match $TARGET_NAME only as a complete path component
          $p =~ m{(?:^|/)\Q$TARGET_NAME\E(?:/|\z)}
             or next;
          print("$_\t$p\n");
       }
    ' "$TARGET_NAME" /tmp
    

    As requested, this will match

    TARGET_NAME
    TARGET_NAME/
    TARGET_NAME/x
    .../TARGET_NAME
    .../TARGET_NAME/
    .../TARGET_NAME/x
    

    but not

    TARGET_NAMEx/...
    .../TARGET_NAMEx
    .../TARGET_NAMEx/...
    xTARGET_NAME/...
    .../xTARGET_NAME
    .../xTARGET_NAME/...
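
The same boundary rule can also be expressed without a regex, which may be handy inside the ksh script itself. A minimal sketch in plain POSIX shell, assuming a hypothetical helper name matches_component: quoting "$name" inside the case patterns makes characters such as "*" and "@" literal, so no escaping step is needed.

```shell
# Hypothetical helper (not from the original answer): succeeds when $2 is a
# complete path component of $1.
matches_component() {
   path=$1
   name=$2
   case $path in
      "$name" | "$name"/* | */"$name" | */"$name"/*) return 0 ;;
      *) return 1 ;;
   esac
}

TARGET_NAME='Upload_Shema@@@.DATA.com'
for p in \
   "/usr/admin/Collect_tests/Upload.CM@.www.com/$TARGET_NAME" \
   "/usr/admin/Collect_tests/Upload.CM@.www.com/${TARGET_NAME}x"
do
   if matches_component "$p" "$TARGET_NAME"; then
      echo "match:    $p"
   else
      echo "no match: $p"
   fi
done
```

The four patterns correspond to the four boundary combinations accepted by the regex: start or "/" on the left, "/" or end of string on the right.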
    

    Note: Change find ... -exec ... \; to find ... -exec ... + if your find supports it, so that perl is started once for many links rather than once per link.