I have a large file. Most lines look like this (record number, dot, space, last name, first name):
1. Moore, Roger
2. Connery, Sean
3. ....
100. Dalton, Timothy
Occasionally, some unpleasant lines look like this:
110. Bronson, Pierce 111. Gomez, Selena 112. Portman, Nathalie
I need a regular expression to break those unpleasant lines up like this:
110. Bronson, Pierce
111. Gomez, Selena
112. Portman, Nathalie
Some lines have two records, but others have five or more. They appear when I copy and paste a PDF document into TextWrangler, which is the editor I use.
I haven't used TextWrangler in years, but it has regex capabilities. You need to use Find and Replace with a regex.
Here is a working regex that identifies every extra numbered entry on a line.
You want to replace what it matches with something like \n$1, where \n is a newline character and $1 is the text captured in the match, so it should result in
110. Bronson, Pierce 111. Gomez, Selena 112. Portman, Nathalie
going to
110. Bronson, Pierce
111. Gomez, Selena
112. Portman, Nathalie
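If you would rather script it, or just want to check the idea before running it on the whole file, here is a minimal Python sketch of the same find-and-replace. The pattern is my own assumption, not the regex referred to above: it looks for whitespace followed by digits, a dot, and a space, and captures the number part. The names EXTRA_RECORD and split_records are made up for illustration. Note that Python writes the captured group as \1 rather than $1.

```python
import re

# Hypothetical pattern (an assumption, not the regex referred to above):
# spaces or tabs, then a record number, a dot, and a space.
# The "number dot space" part is captured as group 1.
EXTRA_RECORD = re.compile(r"[ \t]+(\d+\. )")


def split_records(text: str) -> str:
    """Move every record that follows another on the same line to its own line."""
    # Replace the leading whitespace with a newline and keep the captured number.
    return EXTRA_RECORD.sub(r"\n\1", text)


sample = (
    "100. Dalton, Timothy\n"
    "110. Bronson, Pierce 111. Gomez, Selena 112. Portman, Nathalie"
)
print(split_records(sample))
```

Because the pattern requires a space or tab in front of the number, records that already start a line are left alone. A name that itself contains a number followed by a dot and a space would be split by mistake, so it is worth spot-checking the result.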