python, python-3.x, regex

Python regex replacement with str.replace. Pattern (word hyphen word), replacement (word space hyphen space word)


I have the following regex pattern for word-word:

r'\w+\-\w+'

I would like to replace it with

r'\w+\s\-\s\w+'

Example: I would like to change

hello-friends to hello - friends 

I have tried the following with no success:

df['mytextcolumn'].str.replace(r'(\\w+)(\\-)(\\w+)',r'(\\w+)(\\s)(\\-)(\\s)(\\w+)')

I also tried with re.sub:

re.sub(r'\\w+\\-\\w+',r'\\w+\\s\\-\\s\\w+','hello-friends') 

but I still get back hello-friends, not hello - friends

I also checked my regex with an online regex matcher for python, and it picks up the patterns correctly, so I am confused why I am unable to replace it within my script.


Solution

  • You cannot use a regex pattern as the replacement: the replacement string is inserted as-is, apart from backreferences. Instead, use two capture groups in the pattern and refer to them as \1 - \2 in the replacement.

    You could also capture the - in a group, but since it is a single character that you match literally, you can just write it directly in the replacement.

    (\w+)-(\w+)
    

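    import pandas as pd
    df = pd.DataFrame({'mytextcolumn': ['hello-friends']})  # sample data from the question
    df['mytextcolumn'] = df['mytextcolumn'].str.replace(r'(\w+)-(\w+)', r'\1 - \2', regex=True)
    print(df)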
    

    Output

          mytextcolumn
    0  hello - friends
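
The same idea can be tried outside pandas with plain re.sub. This is only a minimal sketch on a single string; the variable names are made up for illustration. It also shows why the doubled backslashes in the question leave the text untouched, and a lookaround variant (not part of the answer above, just one possible alternative) that may be handy when a word chain contains more than one hyphen.

```python
import re

text = "hello-friends"

# Two capture groups around the literal hyphen; \1 and \2 in the
# replacement refer back to what each group matched.
result = re.sub(r"(\w+)-(\w+)", r"\1 - \2", text)
print(result)

# In a raw string, \\w is a literal backslash followed by "w", so this
# pattern matches nothing here and the text comes back unchanged.
unchanged = re.sub(r"\\w+\\-\\w+", "X", text)
print(unchanged)

# Alternative (an assumption, not from the answer): match only the hyphen
# between two word characters, so each hyphen in a chain is spaced out.
chained = re.sub(r"(?<=\w)-(?=\w)", " - ", "a-b-c")
print(chained)
```

The capture-group version consumes the words on both sides of the hyphen, so in a chain like a-b-c only the first hyphen is replaced in a single pass; the lookaround version consumes only the hyphen itself.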