python regex substring punctuation

Extract substring from dot until colon with Python regex


I have a string that resembles the following:

'My substring1. My substring2: My substring3: My substring4'

Ideally, my aim is to extract 'My substring2' from this string with Python regex. However, I would also be pleased with a result that resembles '. My substring2:'

So far, I am able to extract

'. My substring2: My substring3:'

with

"\.\s.*:"

Alternatively, using Wiktor Stribiżew's solution to a somewhat similar question (How can i extract words from a string before colon and excluding \n from them in python using regex), I have been able to extract

'My substring1. My substring2'

specifically with

r'^[^:-][^:]*'

However, after many hours of searching and trying (I am quite new to regex), I have been unable to combine the two results into a single regular expression that extracts 'My substring2' from the string above.

I would be eternally grateful if someone could help me find the correct regular expression to extract 'My substring2'. Thanks!


Solution

  • You can use a non-greedy (lazy) quantifier, .*?, which stops at the first colon instead of the last:

    import re
    
    s = "My substring1. My substring2: My substring3: My substring4"
    
    # \.\s* skips the dot and any whitespace; (.*?) lazily captures up to the first colon
    print(re.search(r"\.\s*(.*?):", s).group(1))
    

    Prints:

    My substring2
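
To see why the original pattern overshoots, it may help to compare the greedy and lazy versions side by side. The sketch below also shows a negated character class ([^:]*) as one possible alternative, and wraps it in a small helper that returns None when nothing matches. The helper name between_dot_and_colon is hypothetical and only for illustration; it is not part of the re module.

```python
import re

s = "My substring1. My substring2: My substring3: My substring4"

# Greedy: .* runs to the end of the string, then backtracks to the LAST colon.
greedy = re.search(r"\.\s.*:", s).group(0)

# Lazy: .*? grows one character at a time and stops at the FIRST colon.
lazy = re.search(r"\.\s*(.*?):", s).group(1)

# Negated character class: [^:]* can never cross a colon.
negated = re.search(r"\.\s*([^:]*):", s).group(1)


def between_dot_and_colon(text):
    """Hypothetical helper: text between a dot and the next colon, or None."""
    m = re.search(r"\.\s*([^:]*):", text)
    return m.group(1) if m else None


print(greedy)   # . My substring2: My substring3:
print(lazy)     # My substring2
print(negated)  # My substring2
print(between_dot_and_colon("no delimiters here"))  # None
```

Calling .group() directly on the result of re.search raises AttributeError when there is no match, so checking for None first (as the helper does) is probably safer for real input. One difference worth knowing: [^:] also matches newlines, whereas . does not unless re.DOTALL is set.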