I tried to reorganize the format of a file containing:
>Human|chr16:86430087-86430726 | element 1 | positive
>Human|chr16:85620095-85621736 | element 2 | negative
>Human|chr16:80423343-80424652 | element 3 | negative
>Human|chr16:80372593-80373755 | element 4 | positive
>Human|chr16:79969907-79971297 | element 5 | negative
>Human|chr16:79949950-79951518 | element 6 | negative
>Human|chr16:79026563-79028162 | element 7 | negative
>Human|chr16:78933253-78934686 | element 9 | negative
>Human|chr16:78832182-78833595 | element 10 | negative
My command is:
awk '{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}'
Here is the output:
>Human|chr16:86430087-86430726 | element 1 |
>Human chr16:85620095-85621736 element 2 negative
>Human chr16:80423343-80424652 element 3 negative
>Human chr16:80372593-80373755 element 4 positive
>Human chr16:79969907-79971297 element 5 negative
>Human chr16:79949950-79951518 element 6 negative
>Human chr16:79026563-79028162 element 7 negative
>Human chr16:78933253-78934686 element 9 negative
>Human chr16:78832182-78833595 element 10 negative
Every line works fine except for the first line. I don't understand why this happened.
Can someone help me with it? Thanks!
FS and OFS are set too late to affect the first line. Use something like this instead:
awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t'
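You can also use this shorter version. Assigning $1 to itself makes awk rebuild the record with OFS, and the assignment also serves as the pattern, so a line is printed only if its first field is non-empty and not 0: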
awk -v FS='|' -v OFS='\t' '$1=$1'
It doesn't work because awk has already performed record/field splitting by the time FS and OFS are set. You can force a re-splitting by setting $0 to $0, e.g.:
awk '{FS="|";OFS="\t";$0=$0} {print $1,$2,$3,$4,$5}'
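To see the timing for yourself, here is a minimal sketch using two made-up records (not the data from the question). It prints NF and $1 for each record, first with FS assigned inside the main action and then with the $0=$0 re-split added. It assumes a POSIX-conforming awk such as gawk or mawk; some older implementations split fields lazily and behave differently.

```shell
# Two made-up sample records, pipe-separated with embedded spaces.
input='a|b c|d
e|f g|h'

# FS is assigned inside the main action, so record 1 has already been split
# on whitespace when the assignment runs; record 2 is split on "|".
late=$(printf '%s\n' "$input" | awk '{FS="|"} {print NF ": " $1}')

# Re-assigning $0 makes awk split the current record again with the new FS.
resplit=$(printf '%s\n' "$input" | awk '{FS="|"; $0=$0} {print NF ": " $1}')

printf 'late:\n%s\nresplit:\n%s\n' "$late" "$resplit"
```

In the first run only the second record is split on "|"; in the second run both are.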
The conventional ways to do this are: 1. set FS and other variables in the BEGIN clause, 2. set them with the -v VAR=VALUE notation, or 3. append them after the script as VAR=VALUE. My preferred style is the last alternative:
awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t'
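For comparison, the three alternatives can be written side by side. This is only a sketch on one record from the question; it prints four fields rather than five because the sample has four pipe-separated fields, and the variable names out_begin, out_v and out_post are just for illustration.

```shell
# One sample record in the question's format.
line='>Human|chr16:86430087-86430726 | element 1 | positive'

# 1. Assign in a BEGIN block, which runs before any input is read.
out_begin=$(printf '%s\n' "$line" | awk 'BEGIN {FS="|"; OFS="\t"} {print $1,$2,$3,$4}')

# 2. Assign with -v, which takes effect before the program starts.
out_v=$(printf '%s\n' "$line" | awk -v FS='|' -v OFS='\t' '{print $1,$2,$3,$4}')

# 3. Assign after the script, which takes effect before the input is read.
out_post=$(printf '%s\n' "$line" | awk '{print $1,$2,$3,$4}' FS='|' OFS='\t')

printf '%s\n' "$out_begin" "$out_v" "$out_post"
```

All three print the same tab-separated line. The spaces around the original "|" characters stay inside the fields, because only the "|" itself is the separator.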
Note that there is a significant difference in when -v and post-script variables are set: -v sets variables before the BEGIN clause runs, whereas post-script assignments take effect only after the BEGIN clause has finished.
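A small sketch makes the difference visible. The variable x and the single input record are made up for the demonstration.

```shell
# -v: the variable is already set when BEGIN runs.
with_v=$(awk -v x=1 'BEGIN {print "BEGIN sees x=[" x "]"}')

# Post-script: BEGIN runs first, so x is still empty there; it is set by
# the time the first record from standard input is processed.
with_post=$(printf 'record\n' | awk 'BEGIN {print "BEGIN sees x=[" x "]"} {print "main sees x=[" x "]"}' x=1)

printf '%s\n%s\n' "$with_v" "$with_post"
```

This matters mainly when the BEGIN clause itself uses the variable. For a script that only sets FS and OFS for the main action, either style gives the same result.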