bash grep null-character

grep not working when a null character exists and the -z option is used


My grep command is:

grep -Pzo -a 'Start(.*\n)*?.*?End' testfile.txt

And testfile.txt contains:

ItsTestStartFromHereEndNotVisibleStartFrom
HereEndOkNotVisible

The output:

$ grep -Pzo -a 'Start(.*\n)*?.*?End' testfile.txt

StartFromHereEndStartFrom
HereEnd

It works fine, but when a null character exists between "Start" and "End", it does not work. I know this is because I used the "-z" option, but I need it for multi-line support.

For example, this is my content with null characters:

ItsTestStartFrom[\x00]HereEndNotVisibleStart[\x00]From
HereEndOkNotVisible

Solution

  • You can use perl instead:

    $ cat -A ip.txt
    ItsTestStartFrom^@HereEndNotVisibleStart^@From$
    HereEndOkNotVisible$
    
    $ # -0777 will slurp the entire file, so NUL won't create issues
    $ perl -0777 -ne 'print /Start.*?End/sg' ip.txt | cat -A
    StartFrom^@HereEndStart^@From$
    HereEnd$ 
    
    $ perl -0777 -nE 'say /Start.*?End/sg' ip.txt
    StartFromHereEndStartFrom
    HereEnd
    

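    With the OP's sample, grep -z finds no match: -z splits the input into NUL-separated records, and a NUL occurs between each Start and its End, so no single record contains both. When a Start...End pair has no NUL in between, grep works: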

    $ cat -A ip.txt 
    ItsTestStartFrom^@HereEndNotVisibleStart^@From$
    HereEndOkNotVisible$
    Start 3243$
    asdf End asd$
    $ grep -Pzo '(?s)Start.*?End' ip.txt
    Start 3243
    asdf End$
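
To see the record splitting directly, here is a minimal sketch, assuming GNU grep. It recreates the question's sample with printf, counts the NUL-separated records that -z produces, and counts how many of them contain Start followed by End. The variable names and the scratch file are only for illustration.

```shell
# Recreate the sample input; printf expands \000 to a NUL byte.
sample=$(mktemp)   # scratch file, name is arbitrary
printf 'ItsTestStartFrom\000HereEndNotVisibleStart\000From\nHereEndOkNotVisible\n' > "$sample"

# With -z every NUL ends a record, so the empty pattern counts the records.
records=$(grep -zc '' "$sample")

# Count the records that contain Start followed by End.
# grep exits with status 1 when nothing matches, hence the "|| true".
matching=$(grep -zc 'Start.*End' "$sample" || true)

echo "records=$records matching=$matching"
rm -f "$sample"
```

On the sample this should report three records and no matching record, which is why the grep -z approach prints nothing there while perl -0777, which reads the whole file as one string, still finds both matches.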