linux bash filenames file-rename batch-rename

How to remove a varying string of alphanumeric and non-alphanumeric characters from multiple file names


How do I remove a string of alphabetic, numeric and non-alphanumeric characters from multiple file names, where the string differs for each file?

Example of files:

11SP60_H5LMLDSX7_AAGATACACG-TGTTAGCACA_L004_R1.fastq.gz
12HH32_H5LMLDSX7_TGCAATGAAT-TTACTTCTGG_L001_R2.fastq.gz
B00699_H5LMLDSX7_CCGCTCCGTT-CTTCGCCGTA_L002_R1.fastq.gz 
A80101_H5LMLDSX7_TAGGTATGTT-CTTGGTCTCG_L003_R1.fastq.gz

Example of what I want the output to be:

11SP60_L004_R1.fastq.gz
12HH32_L001_R2.fastq.gz
B00699_L002_R1.fastq.gz 
A80101_L003_R1.fastq.gz

I am unable to use the remove function due to usage rights. Thank you!


Solution

  • You will need to loop over the filenames. In bash that would be:

    for f in *.fastq.gz ; do
        newf=$(echo "$f" | sed 's/\([^_]*\)_[^_]*_[^_]*_\(.*\)/\1_\2/')
        echo "mv $f $newf"
    done
    

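    Look at the output. The sed expression keeps the first underscore-separated field, drops the second and third, and keeps everything after them. If you're not very experienced in bash, it is a good idea to echo the commands first; otherwise a mistake could rename files in ways you did not intend.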

    When you are satisfied that the new name is correct, replace the

    echo "mv $f $newf"
    

    with

    mv "$f" "$newf"
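
As an alternative to sed, the same renaming can be sketched with bash parameter expansion alone. This is only a sketch, not the method used above: the function name `short_name` is made up for illustration, and it assumes every file name has at least four underscore-separated fields, as in the examples in the question. Names that do not fit that shape are returned unchanged, which mirrors what the sed command does when its pattern does not match.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the name with its 2nd and 3rd
# underscore-separated fields removed.
short_name() {
    local name=$1
    # Leave names with fewer than three underscores untouched.
    [[ $name == *_*_*_* ]] || { printf '%s\n' "$name"; return; }
    local first=${name%%_*}   # text before the first underscore
    local rest=${name#*_}     # strip the first field
    rest=${rest#*_}           # strip the second field
    rest=${rest#*_}           # strip the third field
    printf '%s_%s\n' "$first" "$rest"
}

# Dry run: only print the mv commands.
for f in *.fastq.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern if nothing matches
    echo mv -- "$f" "$(short_name "$f")"
done
```

As with the sed version, check the printed commands first, then remove the `echo` in the loop to perform the renames. The `--` stops `mv` from treating a file name that starts with a dash as an option.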