
How to find the shortest substring of a string before a certain text in Python 3


I am trying to extract the shortest substring of a string before a certain text in Python 3. For instance, I have the following string.

\\n...\\n...\\n...TEXT

I want to extract the shortest substring that contains exactly two \\n before 'TEXT'. The example text may contain an arbitrary number of \\n, with arbitrary letters between them.

I have already tried this in Python 3.4, but the result is the whole original text. It seems that the search starts at the first '\n' and then treats the remaining '\n' occurrences as ordinary text.

    import re

    text = '\\n abcd \\n efg \\n hij TEXT'
    pattern1 = re.compile(r'\\n.*?\\n.*?TEXT', re.IGNORECASE)
    obj = re.search(pattern1, text)
    obj.group(0)

When I run this code, the result is \\n abcd \\n efg \\n hij TEXT, which is exactly the same as the input.

I would like the result to be

\\n efg \\n hij TEXT

Can anyone help me with this?


Solution

  • Using regex with negative lookahead:

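    import re
    
    text = '\\n abcd \\n efg \\n hij TEXT'
    # Match a literal \n that is not followed by another \n plus a later
    # backslash (for this sample, the second-to-last \n), then take the rest.
    pattern = re.compile(r'(\\n(?!.*\\n.*\\).*)')
    res = re.search(pattern, text)
    res.group(0)  # '\\n efg \\n hij TEXT'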
    

  • Using Python string methods:

    text = '\\n abcd \\n efg \\n hij TEXT'
    text[text[:text.rfind("\\n")].rfind("\\n"):]
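
As for why the pattern in the question returns the whole string: a regex search reports the leftmost position where a match is possible, and the lazy `.*?` only shortens the match from the right. The first \\n already allows a match, so the search never moves on to the second one. Below is a minimal sketch of two ways around that, assuming the markers are the literal two characters backslash and n, as in the sample. The function names and the `marker`, `end` and `count` parameters are made up for illustration.

```python
import re

text = '\\n abcd \\n efg \\n hij TEXT'

# (?:(?!\\n).)* consumes one character at a time, but only while no literal
# \n starts at that position, so neither gap can contain a further marker.
TEMPERED = re.compile(r'\\n(?:(?!\\n).)*\\n(?:(?!\\n).)*TEXT')

def shortest_with_regex(s):
    """Return the first span '\\n...\\n...TEXT' whose gaps contain no marker."""
    m = TEMPERED.search(s)
    return m.group(0) if m else None

def shortest_with_rfind(s, marker='\\n', end='TEXT', count=2):
    """Walk back `count` markers from the first `end` and slice from there."""
    stop = s.find(end)
    if stop == -1:
        return None
    pos = stop
    for _ in range(count):
        # Search only to the left of the previously found position.
        pos = s.rfind(marker, 0, pos)
        if pos == -1:
            return None
    return s[pos:stop + len(end)]

print(shortest_with_regex(text))
print(shortest_with_rfind(text))
```

Both calls print `\n efg \n hij TEXT` for the sample. The regex version can take `re.IGNORECASE` as in the question. The `rfind` version is case-sensitive, but it generalizes to any number of markers through `count`.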