awk vcf-variant-call-format

awk to skip lines up to and including pattern


I am trying to use awk to skip every line up to and including the one that matches /^#CHROM/ and start processing on the line below it. The awk command runs, but it currently returns all lines in the tab-delimited file. Thank you :).

file

##INFO=<ID=ANN,Number=1,Type=Integer,Description="My custom annotation">
##source_20170530.1=vcf-annotate(r953) -d key=INFO,ID=ANN,Number=1,Type=Integer,Description=My custom annotation -c CHROM,FROM,TO,INFO/ANN
##INFO=<ID=,Number=A,Type=Float,Description="Variant quality">
#CHROM  POS ID  REF ALT
chr1    948846  .   T   TA  NA  NA
chr2    948852  .   T   TA  NA  NA
chr3    948888  .   T   TA  NA  NA

awk

awk -F'\t' -v OFS="\t" 'NR>/^#CHROM/ {print $1,$2,$3,$4,$5,"ID=1"$6,"ID=2"$7}' file

desired output

chr1    948846  .   T   TA  ID1=NA  ID2=NA
chr2    948852  .   T   TA  ID1=NA  ID2=NA
chr3    948888  .   T   TA  ID1=NA  ID2=NA

Solution

  • awk 'BEGIN{FS=OFS="\t"} f{print $1,$2,$3,$4,$5,"ID1="$6,"ID2="$7} /^#CHROM/{f=1}' file
    

    See https://stackoverflow.com/a/17914105/1745001 for details on this and other awk search idioms. Yours is a variant of "b" on that page.
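
    As far as I can tell, the original attempt prints everything because `NR>/^#CHROM/` compares the line number with the 0-or-1 result of the regex match, which is true for essentially every line. Below is a minimal sketch of the flag idiom, run against a small sample built with printf (the temp file and the variable names `tmp` and `out` are just for illustration, not part of the question):

    ```shell
    # Build a small tab-delimited sample in a temp file (illustrative only).
    tmp=$(mktemp)
    printf '##INFO=<ID=ANN,Number=1,Type=Integer,Description="My custom annotation">\n' > "$tmp"
    printf '#CHROM\tPOS\tID\tREF\tALT\n' >> "$tmp"
    printf 'chr1\t948846\t.\tT\tTA\tNA\tNA\n' >> "$tmp"
    printf 'chr2\t948852\t.\tT\tTA\tNA\tNA\n' >> "$tmp"

    # f is unset (false) until the #CHROM header has been seen.
    # The print rule comes before the rule that sets f, so the
    # header line itself is skipped and printing starts on the next line.
    out=$(awk 'BEGIN{FS=OFS="\t"}
               f{print $1,$2,$3,$4,$5,"ID1="$6,"ID2="$7}
               /^#CHROM/{f=1}' "$tmp")

    printf '%s\n' "$out"
    rm -f "$tmp"
    ```

    Swapping the order of the two rules (setting f before the print rule) would include the #CHROM line in the output, so the ordering is what makes this "skip up to and including".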