I have this file:
m64071_220512_054244/12584899/ccs rev pet047-10055 ACGTGCGACCTTGTGA TTGAGGGTTCAAACGTGCGACCTTGTGA
m64071_220512_054244/128321000/ccs rev pet047-10055 ACGTGCGACCTTGTGA TTGAGGGTTCAAACGTGCGACCTTGTGA
m64071_220512_054244/132186699/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/134874748/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
I need to tr
and reverse
the fields $4 and $5 only if $2==rev
Expect:
m64071_220512_054244/12584899/ccs rev pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/128321000/ccs rev pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/132186699/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
m64071_220512_054244/134874748/ccs fwd pet047-10055 TCACAAGGTCGCACGT TCACAAGGTCGCACGTTTGAACCCTCAA
I tried:
perl -lpe 'if(/rev/) {$rev=/rev/;next}; if ($rev) {$F[4,5]=~tr/ATGC/TACG/; $F[4,5]=reverse $F[4,5]; print "@F"}' file
I also tried to go with Awk (Execute bash command inside awk and print command output)
awk '{
if($2==rev)
{
cmd1="echo \047" $4 "\047 | rev | tr \047ATGC\047 \047TACG\047"
cmd2="echo \047" $5 "\047 | rev | tr \047ATGC\047 \047TACG\047"
newVar1=((cmd1 | getline line) > 0 ? line : "failed")
newVar2=((cmd2 | getline line) > 0 ? line : "failed")
close(cmd)
print $1, $2, $3, newVar1, newVar2
}
else {print}
}' file
To follow the question's attempt:
perl -w -lanE'
if ($F[1] eq "rev") {
for (@F[3,4]) { tr/ATGC/TACG/; $_ = reverse $_ }
}
say "@F"
' file
Can put this on one line, I spread it out for readability (or copy-paste it as is into most shells). Or, put it in a program file of course, specially if there is more to do.
The code in the block of that loop over @F[3,4]
elements runs on each element of the array and modifies it in place (since $_
is just an alias to array elements), first with tr
then by (reversing and) assigning. All that can also be written as
$_ = reverse tr/ATGC/TACG/r for @F[3,4];
The r
modifier makes tr
return the changed string, which is then reverse
-ed and then assigned back to the currently processed array element, via the $_
alias for it
Comments on the code in the question
To have it break up the input string into the @F
array ("autosplit") need -a
flag
Since you are explicitly printing what is needed use -n
flag, not -p
Fields 4 and 5 in the line are array elements 3 and 4
I assume that by $F[4,5]
you mean the two array elements (which should be 3,4). That, then, should be @F[3,4]
-- and with -w
flag, for warnings, we'd hear about it
More importantly, we cannot bind a regular expression or a tr
pattern to a list, but only to a single scalar.† In order to apply that tr
to multiple items need to iterate over them, like above
† With @ary =~ /.../
the LHS is first evaluated so we get something like
$ary[0], ..., $ary[-1] =~ /.../;
All terms up to the last one are evaluated in void context and discarded (see the comma operator for the logic of it), one by one and with a warning (if warnings are on!), and then $ary[-1] =~ /.../
is evaluated and its return returned from the whole expression.
The same goes for LIST =~ /.../;
(with parens around LIST
or not)