bash awk

How can I pass a filename prefix to an AWK command that splits a file?


I am using this AWK command to produce a file for each value in the first column of input:

awk -F "," '{print > $1 ".csv" }'  test.csv

Contents of test.csv:

1,Rahul,
2,Atul,
3,Sachin,
4,Reyansh,
1,Rahul,
3,Sachin 

This produces output files named 1.csv, 2.csv, 3.csv and 4.csv.

However, I need to add a prefix to each name to get sourcefile1.csv, sourcefile2.csv, sourcefile3.csv and sourcefile4.csv.

The prefix (sourcefile in the example) is held in the shell variable $fle_name.

I tried to include it in the AWK program this way:

awk -F "," '{print > $1 "`echo $fle_name`.csv" }'  test.csv

But this produces $fle_name1.csv, $fle_name2.csv etc.


Solution

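  • Shell variables are not expanded inside single quotes, so awk treats the backticks and $fle_name as literal text and writes to files named 1`echo $fle_name`.csv and so on. It is still good practice to enclose the awk program in single quotes so that it does not conflict with shell syntax, so pass the shell variable to awk with -v instead. Put the variable before $1 to get a prefix, and parenthesize the concatenation after >, because an unparenthesized expression there is parsed differently by some awk implementations:

    awk -F "," -v file="$fle_name" '{print > (file $1 ".csv") }' test.csv

    Example (this run places the variable after $1, so its value follows the field value in each filename):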

    $ fle_name=MY_FILE
    $ gawk -F "," -v file="$fle_name" '{print > $1 file ".csv" }' test.csv
    $ ls -l
    total 20
    -rw-rw-r-- 1 user user 18 Mar 26 11:04 1MY_FILE.csv
    -rw-rw-r-- 1 user user  8 Mar 26 11:04 2MY_FILE.csv
    -rw-rw-r-- 1 user user 19 Mar 26 11:04 3MY_FILE.csv
    -rw-rw-r-- 1 user user 11 Mar 26 11:04 4MY_FILE.csv
    -rw-rw-r-- 1 user user 56 Mar 26 11:00 test.csv
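
To check the prefix form end to end, here is a minimal sketch that recreates the question's input in a scratch directory and runs the split with the variable placed before $1. The awk variable name prefix and the use of mktemp are choices made for this illustration; they do not come from the original commands.

```shell
# Work in a scratch directory so the generated files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input taken from the question.
printf '%s\n' '1,Rahul,' '2,Atul,' '3,Sachin,' '4,Reyansh,' '1,Rahul,' '3,Sachin' > test.csv

# Prefix held in a shell variable, as in the question.
fle_name=sourcefile

# -v copies the shell value into the awk variable "prefix" (a name chosen here).
# The parentheses make awk build the whole filename before redirecting to it.
awk -F "," -v prefix="$fle_name" '{ print > (prefix $1 ".csv") }' test.csv

# List the results: one file per distinct first-column value, plus test.csv.
ls
```

Each distinct value in the first column should end up in its own file named with the prefix first, for example sourcefile1.csv holding both lines that start with 1.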