I have some Python code in which I want to scan a string and split it at the first occurrence of a non-allowable character.
import re, string
mystring = "my_id=abc-something_123&anything#;?lcdkahck;my_id%3Dkckdkkj_bcjc"
if "my_id=" in mystring:
    # Keep at most 100 characters after the "my_id=" prefix.
    mystring = mystring[mystring.index("my_id=") + 6:][:100]
    # Cut at the first ';', '&' or '#'.
    mystring = re.split('[;&#]', mystring)[0]
print(mystring)
This gives me the correct string whenever one of ;&# follows the value, but my data can contain any unpredictable character other than ;&#.
Here is what I tried in order to strip out those characters:
allowable_character = '-' + '_' + string.ascii_letters + string.digits
mystring = re.sub('[^%s]' % allowable_character, '', mystring)
print(mystring)
But this only removes every character that is not in 'allowable_character' and keeps the rest of the string.
What I am trying to achieve is to cut the string at the first character that is not in 'allowable_character' and return the part before it.
So the expected output is 'abc-something_123'.
Any help is appreciated.
You could just use re.findall here:
import re
mystring = "my_id=abc-something_123&anything#;?lcdkahck;my_id%3Dkckdkkj_bcjc"
# Capture the run of word characters and hyphens right after "my_id=".
match = re.findall(r'^my_id=([\w-]*).*$', mystring)[0]
print(match)
This prints:
abc-something_123
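Two caveats about the pattern above: the `^` anchor means it only matches when the string starts with `my_id=`, and in Python 3 `\w` also accepts non-ASCII letters and digits, which is slightly broader than the `allowable_character` set in the question. If either matters, a variant along these lines might fit better. It is only a sketch: `extract_id` and `ID_PATTERN` are names made up for illustration, and the character class assumes the allowed set is exactly ASCII letters, digits, `-` and `_`.

```python
import re

# Hypothetical names for illustration. The character class mirrors the
# question's allowable_character: ASCII letters, digits, '_' and '-'.
ID_PATTERN = re.compile(r'my_id=([A-Za-z0-9_-]*)')

def extract_id(text, default=None):
    """Return the run of allowed characters after the first 'my_id=',
    or `default` when 'my_id=' does not occur in the text."""
    match = ID_PATTERN.search(text)  # search() scans the whole string
    return match.group(1) if match else default

mystring = "my_id=abc-something_123&anything#;?lcdkahck;my_id%3Dkckdkkj_bcjc"
print(extract_id(mystring))      # abc-something_123
print(extract_id("no id here"))  # None
```

Because the capture group stops at the first character outside the class, no separate split step is needed. If the 100-character cap from the question should be kept, replacing `*` with `{0,100}` in the pattern should have the same effect.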