I'm using the regex re.findall(r"[0-9]+(.*?)\.\s(.*?)[0-9]+", text)
to parse the text below:
8 EXT./INT. MONORAIL - MORNING 8
9 EXT. CITY SCAPE/MONORAIL - CONTINUOUS 9
But my current output contains neither the prefix nor the suffix numbers. I'd like the prefix digits to be included in the output as well, as follows:
9 EXT. CITY SCAPE/MONORAIL - CONTINUOUS
Any help greatly appreciated! Thanks in advance.
(The current output is given below)
You can use
(?m)^([0-9]+)\s*(.*?)\.\s(.*?)(?:\s*([0-9]+))?$
See the regex demo. Details:
(?m) - a multiline modifier (makes ^ and $ match at line boundaries)
^ - start of a line
([0-9]+) - Group 1: one or more digits
\s* - zero or more whitespaces
(.*?) - Group 2: zero or more chars other than line break chars, as few as possible
\.\s - a dot and a whitespace
(.*?) - Group 3: zero or more chars other than line break chars, as few as possible
(?:\s*([0-9]+))? - an optional occurrence of zero or more whitespaces and then Group 4 capturing one or more digits
$ - end of a line.
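
As a minimal sketch of how this pattern might be used from Python, the snippet below runs it over the two sample lines from the question and rebuilds each heading with its prefix number but without the trailing one. The variable names (text, pattern, matches, headings) and the f-string used to reassemble the line are illustrative assumptions, not part of the original answer.

```python
import re

# Sample input taken from the question.
text = """8 EXT./INT. MONORAIL - MORNING 8
9 EXT. CITY SCAPE/MONORAIL - CONTINUOUS 9"""

# Same pattern as above; (?m) lets ^ and $ match at each line boundary.
pattern = re.compile(r"(?m)^([0-9]+)\s*(.*?)\.\s(.*?)(?:\s*([0-9]+))?$")

# With several capturing groups, findall returns one tuple per match:
# (prefix number, text before the ". ", text after it, optional suffix number).
matches = pattern.findall(text)

# Rebuild each heading, keeping the prefix number and dropping the suffix.
headings = [f"{num} {kind}. {title}" for num, kind, title, _suffix in matches]

for heading in headings:
    print(heading)
# 8 EXT./INT. MONORAIL - MORNING
# 9 EXT. CITY SCAPE/MONORAIL - CONTINUOUS
```

If the trailing number is missing on a line, Group 4 comes back as an empty string, so the same reassembly step should still work.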