I have a large codon alignment that has a variety of gene names in the headers. The headers are in the following format:
>ENST00000357033.DMD.-1 | CODON | REFERENC
I want to modify all of the headers in the fasta to remove everything from the first "." up to the space before the first "|". Desired outcome:
>ENST00000357033 | CODON | REFERENC
I've tried a few sed commands, no dice. Any advice? I'm averse to using awk, since I'd like to keep the formatting of the alignment and awk scares me.
Thank you!
sed '/^>/s/\.[^ ]* / /'
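For each line starting with '>', this replaces the first dot, any run of non-space characters after it, and the space that follows with a single space. Lines that don't start with '>' (the sequences) are left untouched, so the alignment formatting is preserved.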
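A quick way to check this before running it on the real alignment is to try it on a tiny test file. This is only a sketch: the file names `demo.fasta` and `demo.trimmed.fasta` and the sequence line are made up for illustration, and the header is the one from the question.

```shell
# Hypothetical test file: the header from the question plus a made-up sequence line.
printf '%s\n' '>ENST00000357033.DMD.-1 | CODON | REFERENC' 'ATGCTT---TGG' > demo.fasta

# Edit only header lines (those starting with '>'); write the result to a new file
# so the original stays intact.
sed '/^>/s/\.[^ ]* / /' demo.fasta > demo.trimmed.fasta

# Inspect the result.
cat demo.trimmed.fasta
# >ENST00000357033 | CODON | REFERENC
# ATGCTT---TGG
```

Writing to a new file rather than editing in place makes it easy to compare the two, e.g. with `diff demo.fasta demo.trimmed.fasta`, before replacing anything.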