bash bioinformatics dna-sequence gff

Handling gff file from MISA


Replace whole column in BED file with motif length

I was mining STRs using MISA and collected data from the gff file to make a BED file with 5 columns: Chromosome|Start|End|Motif length|Motif. However, the 4th column currently shows the number of repeats instead of the motif length.

I want to replace the 4th column with the motif length.

for i in perfect.SSR_MISA.bed; do awk '{OFS="\t"} n=$5 q=$(expr length "$n") {print $1, $2, $3, q, $5}' >> perfect.SSR_MISA.bed; sleep 1; done

I tried this, but it doesn't work.


Solution

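    1. Don't use a loop, just pass the file to awk as an argument. awk loops over the lines itself.
    2. Your length expression is wrong. The shell's $(expr length ...) is not evaluated inside a single-quoted awk program; use awk's built-in length() function instead, see: https://riptutorial.com/awk/example/17378/length--string--
    3. Don't use the same file name for input and output. Redirecting with > to the input file empties it before awk can read it.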

    awk 'BEGIN {OFS="\t"} {print $1, $2, $3, length($5), $5}' perfect.SSR_MISA.bed > perfect.SSR_MISA.result.bed
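
To check the behaviour before running it on the real data, here is a minimal sketch on a made-up two-line file. The file names sample.bed and sample.result.bed and the rows themselves are hypothetical, chosen only for illustration; the awk call follows the three points above.

```shell
# Hypothetical sample data (not from the question):
# chromosome, start, end, repeat count, motif
printf 'chr1\t100\t120\t10\tAT\nchr2\t500\t530\t10\tCAG\n' > sample.bed

# Set the output separator once in BEGIN, then print every line with
# column 4 replaced by the length of the motif in column 5.
awk 'BEGIN {OFS="\t"} {print $1, $2, $3, length($5), $5}' sample.bed > sample.result.bed

# Show the result: column 4 should now be 2 for AT and 3 for CAG
cat sample.result.bed
```

If the result has to end up under the original file name, one option is to write to a separate file first and then rename it with mv once the output looks right.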