python string split partitioning

How to split a string by specific symbols and write it to a list?


I have the following string:

my_string='11AB2AB33'

I'd like to convert this string into a list in which 'AB' is kept as a single element, like this:

['1', '1', 'AB', '2', 'AB', '3', '3']

I tried to do it by

list(my_string)

but the result wasn't what I expected:

['1', '1', 'A', 'B', '2', 'A', 'B', '3', '3']

I also tried the partition method:

list(my_string.partition('AB'))

but that didn't give the expected result either:

['11', 'AB', '2AB33']

Solution

  • You can use re.findall with an alternation (the pipe |) that matches either AB or a single non-whitespace character \S. If you also want to match spaces, you can use a . instead of \S (by default, . matches any character except a newline).


    import re
    my_string='11AB2AB33'
    # Try 'AB' first; otherwise match any single non-whitespace character.
    print(re.findall(r'AB|\S', my_string))
    

    Output

    ['1', '1', 'AB', '2', 'AB', '3', '3']
    

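    If you want to use split instead, and the string only contains the characters A-Z or the digits 0-9, you could split at a non-word boundary \B, using a lookahead to assert that directly to the right of that position is either AB or a digit. Splitting on an empty match like this requires Python 3.7 or later.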


    import re
    my_string='11AB2AB33'
    # Split at each non-word-boundary position that is followed by 'AB' or a digit.
    print(re.split(r"\B(?=AB|\d)", my_string))
    

    Output

    ['1', '1', 'AB', '2', 'AB', '3', '3']
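
If you have more than one multi-character symbol to keep together, the re.findall approach can be generalized. The following is only a sketch: the function name tokenize and its tokens parameter are made up for illustration and are not part of the re module. It escapes each symbol with re.escape and tries longer symbols first, so that a longer symbol is not cut short by a shorter one sharing the same prefix.

```python
import re

def tokenize(text, tokens=("AB",)):
    """Split text into single characters, keeping each entry of tokens whole.

    Hypothetical helper for illustration; whitespace is skipped, as with \\S.
    """
    # Longest tokens first, so 'ABC' is tried before 'AB'.
    ordered = sorted(tokens, key=len, reverse=True)
    # Escape each token so regex metacharacters are matched literally.
    parts = [re.escape(t) for t in ordered]
    # Fall back to any single non-whitespace character.
    parts.append(r"\S")
    return re.findall("|".join(parts), text)

print(tokenize("11AB2AB33"))
```

With the default tokens=("AB",) this builds the same pattern as above, AB|\S, so the printed list matches the expected result from the question.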