I am trying to do a string replacement using regex, but I am not able to achieve the expected result.
I have a list of strings, for example ['INSUTRIN 150G', 'NEUROZINE 30 ML', 'INSULITROL 30 CAPS']. I would like to transform this into:
['INSUTRIN','NEUROZINE','INSULITROL']
I tried the following approach using regex:
import re

def modify_string(x):
    return re.sub("\d{1,}\s{0,}[[ML]|G|KG|CAPS]", "", x).strip()
but I get these results:
['INSUTRIN','NEUROZINE L','INSULITROL APS']
How should I write the regex to get this right?
If all you want is to remove the number and unit part at the end, you can simply replace whatever matches \s\d+.* with an empty string for each item:
import re

def modify_string(x):
    return re.sub(r"\s\d+.*", "", x).strip()

things = ['INSUTRIN 150G', 'NEUROZINE 30 ML', 'INSULITROL 30 CAPS']
things = list(map(modify_string, things))
print(things)
Output:
['INSUTRIN', 'NEUROZINE', 'INSULITROL']
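As for why the original attempt leaves stray letters behind: square brackets define a character class, which matches a single character, not a whole alternative such as ML or CAPS. Here is a minimal sketch of the difference. The patterns below are simplified illustrations of my own, not the exact pattern from the question:

```python
import re

# Character class: matches exactly ONE character out of M, L, |, G, K, C, A, P, S.
char_class = r"\d+\s*[ML|G|KG|CAPS]"
# Non-capturing group: matches one whole alternative (ML, KG, G or CAPS).
group = r"\d+\s*(?:ML|KG|G|CAPS)"

s = "NEUROZINE 30 ML"
class_result = re.sub(char_class, "", s).strip()
group_result = re.sub(group, "", s).strip()

print(class_result)  # NEUROZINE L  (only "30 M" was removed)
print(group_result)  # NEUROZINE
```

So if you want to list the units explicitly, put them in parentheses rather than square brackets.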
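Additionally, if the name is always the first word, you can just split on whitespace:

def modify_string(x):
    return x.split()[0]

If you only want to remove the trailing part when it ends in one of ML, G, KG, CAPS, then:

def modify_string(x):
    return re.sub(r"\s\d+.*(ML|G|KG|CAPS)", "", x).strip()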
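If you need something stricter, one possible variant is to anchor the match at the end of the string and allow only optional whitespace between the number and the unit, so that strings without a recognised unit are left untouched. This is only a sketch: the names strip_quantity and UNITS are hypothetical, and the unit list is assumed from the question.

```python
import re

# Assumed unit list, taken from the question; extend as needed.
UNITS = ("ML", "KG", "G", "CAPS")

# Optional whitespace, a number, optional whitespace, one unit, end of string.
_PATTERN = re.compile(r"\s*\d+\s*(?:" + "|".join(UNITS) + r")\s*$")

def strip_quantity(x):
    """Remove a trailing '<number><unit>' part, if there is one."""
    return _PATTERN.sub("", x).strip()

samples = ['INSUTRIN 150G', 'NEUROZINE 30 ML', 'INSULITROL 30 CAPS', 'VITAMIN B12']
names = [strip_quantity(s) for s in samples]
print(names)  # ['INSUTRIN', 'NEUROZINE', 'INSULITROL', 'VITAMIN B12']
```

The last sample shows the difference from the looser patterns: a number that is not followed by a known unit is kept.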