vim

Vim command to delete everything except for the digits between parentheses


I have a file that looks like this:

Stroszek;12/01/1977;5.74;Drama,Comedy;7.142;Bruno S.(2),Eva Mattes(1),50 Cent(2),Wilhelm von Homburg(2),Burkhard Driest(2),Clayton Szalpinski(0),Ely Rodriguez(0),Alfred Edel(0),Scott McKain(0),Pitt Bedewitz(0),Ralph Wade(0),Michael Gahr(2),Vaclav Vojta(0),Yüksel Topcugürler(0)
Patients;01/03/2017;5.889;Drama,Comedy;25.000;Pablo Pauly(2),Soufiane Guerrab(2),Moussa Mansaly(2),Nailia Harzoune(1),Franck Falise(0),Yannick Renier(2),Alban Ivanov(2),Jason Divengele(0),Côme Levin(2),Dominique Blanc(1),Anne Benoît(1),Rabah Nait Oufella(2)
Carnage;16/09/2011;9.4;Drama,Comedy;45.454;Kate Winslet(1),Jodie Foster(1),Christoph Waltz(2),John C. Reilly(2),Elvis Polanski(0),Eliot Berger(0),Julie Adams(1),Joseph Rezwin(2),Tanya Lopert(1),Nathan Rippy(2),Lexie Kendrick(1)

And I want to convert it using a vim command so it looks like this:

Stroszek;12/01/1977;5.74;Drama,Comedy;7.142;21222000000200
Patients;01/03/2017;5.889;Drama,Comedy;25.000;222102202112
Carnage;16/09/2011;9.4;Drama,Comedy;45.454;11220012121

The digits at the end of each line in the desired output are the digits between parentheses in the original file. The problem is the '50' in '50 Cent' on the first line: I can't simply delete everything except the digits in the sixth field, because that would keep the 50.

I tried commands like:

:%s/\v(.*)(\(\d\))(.*)/\=substitute(submatch(2), '\D', '', 'g')/g

and:

:%s/\v(.*)(\(\d\))(.*)/\2/g

but neither of them works properly. Does anybody know how to do this?


Solution

  • The main reason your commands don't work is that the pattern is too greedy. The leading .* swallows as much of the line as it can, so the pattern matches the whole line exactly once and \2 captures only the last parenthesized digit.

    Making the pattern less greedy certainly helps:

    \v(.{-})(\(\d\))(.{-})
    

    but it still discards the first five fields, which is not what you want.

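    Here is a working substitution. It leaves the first five ;-terminated fields alone (everything before \zs) and rewrites only the rest of the line, keeping the digits found between parentheses. The % range applies it to every line of the file:

    :%s/^\(.\{-};\)\{,5\}\zs.*/\=substitute(submatch(0),'.\{-}(\(\d*\))','\1','g')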
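
    The same idea may be easier to read in very-magic form. This is only a sketch, not the command given above: it assumes that no field contains a literal semicolon and that nothing follows the last closing parenthesis on a line. The exact count {5} and the [^;]*; field pattern are substitutions made here for readability.

    ```vim
    " Keep the first five ;-terminated fields untouched (everything before \zs),
    " then rewrite only the rest of the line, which is what submatch(0) holds.
    " The inner substitute() lazily skips text up to each parenthesized run of
    " digits and keeps just the digits (capture group 1).
    :%s/\v^([^;]*;){5}\zs.*/\=substitute(submatch(0), '\v.{-}\((\d*)\)', '\1', 'g')/
    ```

    On the Stroszek line, the part after \zs is the cast list. Each piece such as ,50 Cent(2) collapses to 2, so the 50 never survives, because only digits inside parentheses are captured.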