bash awk mawk

RS in awk language


I'm learning the awk programming language and I'm stuck on a problem.

I have a file (awk.dat) with the following content:

Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Maecenas pellentesque erat vel tortor consectetur condimentum.
Nunc enim orci, euismod id nisi eget, interdum cursus ex.
Curabitur a dapibus tellus.
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
Aliquam interdum mauris volutpat nisl placerat, et facilisis.

I'm using the command below:

awk 'BEGIN{RS="*, *";ORS="<<<---\n"} {print $0}' awk.dat

It returns this error:

awk: run time error: regular expression compile failed (missing operand)
*, *
    FILENAME="" FNR=0 NR=0

However, if I use the command awk 'BEGIN{RS=" *, *";ORS="<<<---\n"} {print $0}' awk.dat, it gives me the required result.

I need to understand this part: RS=" *, *". What does the space between the opening double quote and the first * (before the comma) mean, and why does leaving it out cause the error?

Expected Output:

Lorem ipsum dolor sit amet<<<---
consectetur adipiscing elit.
Maecenas pellentesque erat vel tortor consectetur condimentum.
Nunc enim orci<<<---
euismod id nisi eget<<<---
interdum cursus ex.
Curabitur a dapibus tellus.
Lorem ipsum dolor sit amet<<<---
consectetur adipiscing elit.
Aliquam interdum mauris volutpat nisl placerat<<<---
et facilisis.
<<<---

Thanks.


Solution

  • "[space1]*,[space2]*"

    is a regex. It matches:

    zero or more spaces (space1), followed by a comma, followed by zero or more spaces (space2).

    The first attempt, "*,[space]*", fails because * has a special meaning in a regex: it repeats the preceding character or group zero or more times. At the very beginning of the pattern there is nothing before it to repeat, so the regex fails to compile, which is the "missing operand" error shown in the question.
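
    As a rough sketch of the point above, the commands below run the working separator on sample input, then show one way to match a literal asterisk by putting it in a bracket expression. The function names split_on_commas and split_on_star_comma are made up for this illustration. This assumes an awk that treats a multi-character RS as a regex, as mawk and gawk do; POSIX leaves that case unspecified.

```shell
# Hypothetical helper: split records on "optional spaces, comma, optional spaces".
# Assumes mawk or gawk, where a multi-character RS is treated as a regex.
split_on_commas() {
  awk 'BEGIN { RS = " *, *"; ORS = "<<<---\n" } { print $0 }'
}

# Hypothetical helper: "[*]" matches a literal asterisk, so the pattern
# means "an asterisk, a comma, then optional spaces".
split_on_star_comma() {
  awk 'BEGIN { RS = "[*], *"; ORS = "<<<---\n" } { print $0 }'
}

# One line from awk.dat in the question.
printf 'Nunc enim orci, euismod id nisi eget, interdum cursus ex.\n' | split_on_commas

# Made-up input containing literal asterisks before each comma.
printf 'a*, b*, c\n' | split_on_star_comma
```

    The last record in each run still ends with the newline from the input, which is why the final <<<--- lands on a line of its own, as in the expected output in the question.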