I am trying to extract all lines of a file that contain a given string, using a for loop over a second file that lists the possible strings. I also want to write the grep results for each string to a new file with that string in the file name.
Here is what I have:
file="variables.txt"
listofvariables=$(cat ${file})
for variable in ${listofvariables}
do
samtools view sample.bam | \
grep "'${variable}'" \
> sample.${variable}.bam
done
What this code does is simply make a blank file for every variable. Why isn't grep extracting lines that contain that variable and putting them into those files?
For reference, here is what the variables.txt file looks like:
mmu-let-7g-5p
mmu-let-7g-3p
mmu-let-7i-5p
mmu-let-7i-3p
mmu-miR-1a-1-5p
mmu-miR-1a-3p
mmu-miR-15b-5p
mmu-miR-15b-3p
mmu-miR-23b-5p
mmu-miR-23b-3p
And here is what the samtools view output looks like:
7238520-1_CATAAT.mmu-miR-125b-5p 0 chr1 11301523 60 75M * 0 0 CAGGTGTTTTCTCAGGCATTTGGATTTCTATAGAATCATAGTATTAAAATTTCAAAGTAATAACATTGCTTTTTA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:75 YT:Z:UU NH:i:1
1422982-2_CCCCGC.mmu-miR-132-3p 0 chr1 11301726 60 97M * 0 0 AAGTCTGTTTTTATGTGAGTGTTCCTGTGAAACTGAGGTCTGATGACTCTTCCTTAAGCAATTACAACTTCATTAGCATACATAAGGTTCAATTAAA IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:97 YT:Z:UU NH:i:1
5675450-1_CCCCGC.mmu-miR-132-3p 0 chr1 11301726 60 97M * 0 0 AAGTCTGTTTTTATGTGAGTGTTCGTGTGAAACTGAGGTCTGATGACTCTTCCTTAAGCAATTACAACTTC^C
For those who may be unfamiliar, samtools view simply reads out the .bam file. You can think of it like cat.
Thanks in advance!
Since ...
What this code does is simply make a blank file for every variable.
... you know that your variables file is being read correctly, and your for loop is correctly iterating over the results. That the resulting files are empty indicates that grep is not finding any matches to your pattern.
Why not? Because the pattern in your grep command ...
grep "'${variable}'" \
... doesn't mean what you appear to think it means. You have taken some pains to get literal apostrophes (') into the pattern, but inside double quotes these have no special meaning to the shell, so they are passed to grep as part of the pattern. Your pattern does not match any lines because in the data, there are no apostrophes around the appearances of the target strings.
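To see the effect in isolation, here is a small sketch that needs neither samtools nor the .bam file. The sample line is a shortened copy of one record from the question, and the names line, quoted_count and plain_count are made up for this illustration only.

```shell
variable="mmu-miR-132-3p"
# Shortened copy of one record from the question's samtools output.
line="1422982-2_CCCCGC.mmu-miR-132-3p 0 chr1 11301726 60 97M"

# Inside double quotes the apostrophes are ordinary characters, so this
# prints the pattern exactly as grep would receive it, apostrophes included.
printf '%s\n' "'${variable}'"

# Pattern with apostrophes: the data has none, so nothing matches.
# grep -c still prints 0 but exits 1, hence the "|| true".
quoted_count=$(printf '%s\n' "$line" | grep -c "'${variable}'" || true)

# Pattern without apostrophes: the name is found.
plain_count=$(printf '%s\n' "$line" | grep -c -F -e "${variable}" || true)

echo "with apostrophes: ${quoted_count}, without: ${plain_count}"
```

The final echo reports 0 matching lines for the apostrophe version and 1 for the plain version.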
This would be better:
grep -F -e "${variable}" \
The -F option tells grep to treat the pattern as a fixed string to match, so that nothing within is interpreted as a regex metacharacter. The -e ensures that the pattern is interpreted as such, even if, for example, it begins with a - character. The double quotes remain, as they are required to ensure that the shell does not perform word splitting on the expanded result, and of course the inner apostrophes are gone, since they were causing the main problem.
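Putting it together, one possible shape for the whole loop is sketched below. Several things here are choices of the sketch, not part of the original script: the helper name split_by_name, reading the list with while read instead of for over $(cat ...), dumping the samtools view output to a text file once rather than re-running samtools for every name, and the .txt suffix (the extracted lines are plain text, not BAM).

```shell
# Hypothetical helper: split_by_name LISTFILE TEXTFILE PREFIX
#   LISTFILE  one search string per line (e.g. variables.txt)
#   TEXTFILE  saved output of "samtools view"
#   PREFIX    prefix for the per-string output files
split_by_name() {
    # "|| [ -n ... ]" also handles a last line without a trailing newline.
    while IFS= read -r variable || [ -n "$variable" ]; do
        # Skip blank lines in the list.
        [ -n "$variable" ] || continue
        # Fixed-string match, no apostrophes inside the double quotes.
        # grep exits 1 when nothing matches; that is not an error here,
        # but any other non-zero status (e.g. unreadable file) still is.
        grep -F -e "$variable" "$2" > "$3.$variable.txt" || [ "$?" -eq 1 ]
    done < "$1"
}

# Intended use (requires samtools and the files from the question):
#   samtools view sample.bam > sample.sam
#   split_by_name variables.txt sample.sam sample
```

A name with no matching lines still gets an output file, it is just empty, which makes it easy to tell "no hits" apart from "not processed".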