I'm relatively new to regex and I'm struggling with a very specific application. Say I have a string such as this:
"> The following will be split : This was split: But this wasn't: And neither is this > But this is again: Aswell as this"
I want to split this string at >'s and :'s alternately, that is, split at the first > and at the first : after it, but not at any later :'s until the next > appears. Also, ideally, the >'s should be captured but not the :'s. (The symbols are placeholders for more complex patterns.) For the record, the desired result is:
['>', 'The following will be split', "This was split: But this wasn't: And neither is this", '>', 'But this is again', 'Aswell as this']
How can I do that with a single regular expression?
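Use re.findall with one capture group for each piece: the >, the text up to the first :, and everything after that up to the next >.
import re
string = "> The following will be split : This was split: But this wasn't: And neither is this > But this is again: Aswell as this"
re.findall(r"(>)([^:]+):([^>]+)", string)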
[('>', ' The following will be split ', " This was split: But this wasn't: And neither is this "), ('>', ' But this is again', ' Aswell as this')]
If you want a single flat list instead of a list of tuples (the pieces still keep their surrounding whitespace), do:
list(sum(re.findall(r"(>)([^:]+):([^>]+)", string), ()))
['>', ' The following will be split ', " This was split: But this wasn't: And neither is this ", '>', ' But this is again', ' Aswell as this']
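The flattened list above still carries the spaces around each piece, so it is not quite the list asked for in the question. A minimal sketch of one way to get there, assuming the separators really are the single characters > and : (with more complex placeholder patterns the character classes would need to be replaced). The names text and parts are only for illustration:

```python
import re

text = ("> The following will be split : This was split: But this wasn't: "
        "And neither is this > But this is again: Aswell as this")

# Same pattern as above: group 1 is the ">", group 2 runs up to the first ":",
# group 3 runs up to (but not including) the next ">".
matches = re.findall(r"(>)([^:]+):([^>]+)", text)

# Flatten the list of tuples and strip the whitespace around each piece.
parts = [piece.strip() for match in matches for piece in match]

print(parts)
```

The nested comprehension also avoids sum(..., ()), which builds a new tuple on every addition and so gets slow when there are many matches.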