How do I remove a string of alphabetic, numeric and non-alphanumeric characters from multiple file names, where the value differs for each file?
Example of files:
11SP60_H5LMLDSX7_AAGATACACG-TGTTAGCACA_L004_R1.fastq.gz
12HH32_H5LMLDSX7_TGCAATGAAT-TTACTTCTGG_L001_R2.fastq.gz
B00699_H5LMLDSX7_CCGCTCCGTT-CTTCGCCGTA_L002_R1.fastq.gz
A80101_H5LMLDSX7_TAGGTATGTT-CTTGGTCTCG_L003_R1.fastq.gz
Example of what I want the output to be:
11SP60_L004_R1.fastq.gz
12HH32_L001_R2.fastq.gz
B00699_L002_R1.fastq.gz
A80101_L003_R1.fastq.gz
I am unable to use the remove function due to usage rights. Thank you!
You will need to loop over the filenames. In bash that would be:
for f in *.fastq.gz ; do
    # keep field 1, drop fields 2 and 3, keep everything after the third underscore
    newf=$(echo "$f" | sed 's/\([^_]*\)_[^_]*_[^_]*_\(.*\)/\1_\2/')
    echo "mv $f $newf"
done
Look at the output. If you're not very experienced in bash, it is a good idea to try an echo first. Otherwise, you may get unwanted results if you make a mistake.
When you are satisfied that the new name is correct, replace the
echo "mv $f $newf"
with
mv "$f" "$newf"
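If you would rather not call sed for every file, the same renaming can be sketched with shell parameter expansion alone. This is only an illustration: the function name strip_middle is made up here, and it assumes, as in your examples, that the part to drop is always the second and third underscore-separated fields.

```shell
# Hypothetical helper (not a standard command): print the name with the
# 2nd and 3rd underscore-separated fields removed.
strip_middle() {
    case $1 in
        *_*_*_*) ;;                        # at least three underscores: go on
        *) printf '%s\n' "$1"; return ;;   # otherwise print the name unchanged
    esac
    sample=${1%%_*}     # everything before the first underscore
    rest=${1#*_*_*_}    # everything after the third underscore
    printf '%s_%s\n' "$sample" "$rest"
}

for f in *.fastq.gz; do
    [ -e "$f" ] || continue            # the glob matched nothing
    newf=$(strip_middle "$f")
    if [ "$f" = "$newf" ]; then
        continue                       # name does not fit the pattern, skip it
    fi
    echo "mv $f $newf"                 # dry run, as above
done
```

As with the sed version, check the echoed commands first, then replace the echo line with mv "$f" "$newf" (or mv -n "$f" "$newf" if your mv supports -n, so that an existing file is never overwritten).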