The original data contains tokens that are each repeated four times, separated by spaces. For example,
code2 1 1 1 1 7 7 7 7 10 10 10 10 eq
code9 a a a a tpp1 tpp1 tpp1 tpp1 es
I'd like to add a numbered suffix to each repetition using Perl or Linux shell scripting, but I'm having difficulty matching the groups correctly.
The ideal results are:
code2 1[1] 1[2] 1[3] 1[4] 7[1] 7[2] 7[3] 7[4] 10[1] 10[2] 10[3] 10[4] eq
code9 a[1] a[2] a[3] a[4] tpp1[1] tpp1[2] tpp1[3] tpp1[4] es
Could you suggest implementation ideas or a regular expression for this case?
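One way is a nested substitution: the outer one finds a run of four identical tokens, and the inner one numbers each token in that run. I'd use something like this: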
s{
(?: ^ | \s )
\K
( \S+ )
(?: \s+ \1 ){3}
(?= \s | $ )
}{
my $i = 0;
$& =~ s/\S+\K/ "[".(++$i)."]" /ger
}xge;
Demo:
{
echo 'code2 1 1 1 1 7 7 7 7 10 10 10 10 eq'
echo 'code9 a a a a tpp1 tpp1 tpp1 tpp1 es'
} |
perl -pe'
s{
(?: ^ | \s )
\K
( \S+ )
(?: \s+ \1 ){3}
(?= \s | $ )
}{
my $i = 0;
$& =~ s/\S+\K/ "[".(++$i)."]" /ger
}xge
'
code2 1[1] 1[2] 1[3] 1[4] 7[1] 7[2] 7[3] 7[4] 10[1] 10[2] 10[3] 10[4] eq
code9 a[1] a[2] a[3] a[4] tpp1[1] tpp1[2] tpp1[3] tpp1[4] es
The program can be squished into a single line if you so desire.
s{(?:^|\s)\K(\S+)(?:\s+\1){3}(?=\s|$)}{$i=0;$&=~s/\S+\K/"[".(++$i)."]"/ger}ge
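Since the question also allows plain shell scripting, a field-based awk version might look roughly like the sketch below. It is not from the answer above: the function name `add_suffixes_awk` is hypothetical, and it assumes whitespace-separated fields. Note that assigning to a field makes awk rejoin the line with single spaces, so any runs of multiple spaces are collapsed.

```shell
# Hypothetical awk alternative: walk the fields and number each group of
# four identical consecutive fields.
add_suffixes_awk() {
  awk '{
    i = 1
    while (i <= NF) {
      t = $i ""    # concatenating "" forces string (not numeric) comparison
      if (i + 3 <= NF && $(i+1) == t && $(i+2) == t && $(i+3) == t) {
        for (k = 0; k < 4; k++)
          $(i+k) = $(i+k) "[" (k+1) "]"
        i += 4     # skip past the group just numbered
      } else {
        i++
      }
    }
    print
  }'
}

echo 'code2 1 1 1 1 7 7 7 7 10 10 10 10 eq' | add_suffixes_awk
echo 'code9 a a a a tpp1 tpp1 tpp1 tpp1 es' | add_suffixes_awk
```

The string comparison matters because awk would otherwise treat fields such as `1` and `1.0` as numerically equal.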