python strtok

How can I do strtok()-type parsing in Python?


The title of "How do I do what strtok() does in C, in Python?" suggests it should answer my question, but the specific strtok() behavior I'm looking for is breaking on any one of the characters in the delimiter string. That is, given:

const char *delim = ", ";
char str1[] = "123,456";
char str2[] = "234 567";
char str3[] = "345, 678";

strtok() finds the substrings of digits regardless of how many characters from delim are present. Python's str.split treats the entire delimiter string as a single separator, so I can't do:

delim = ', '
"123,456".split(delim)

because it doesn't find delim as a substring and returns a single-element list.
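
To illustrate, here is a quick check of what str.split gives for the three strings above (the names samples and results are only for this sketch). Only the third string contains the exact two-character sequence ", ", so it is the only one that gets split:

```python
delim = ", "
samples = ["123,456", "234 567", "345, 678"]

# str.split looks for the whole string ", " as one separator,
# not for any single character taken from it.
results = [s.split(delim) for s in samples]

for s, parts in zip(samples, results):
    print(repr(s), "->", parts)
```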


Solution

  • If you know that the tokens are going to be numbers, you should be able to use the split function from Python's re module:

    import re
    re.split(r"\D+", "123,456")
    

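    More generally, you could match on a run of one or more of the delimiter characters (the + keeps "345, 678" from producing an empty string between the comma and the space):

    re.split("[ ,]+", "123,456")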
    

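    or, building the character class from delim (escaped so that characters such as ] or - are taken literally):

    re.split("[" + re.escape(delim) + "]+", "123,456")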
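
One difference from strtok() remains: re.split returns empty strings when the input starts or ends with a delimiter, whereas strtok() skips leading and trailing delimiters. A minimal sketch of a helper that filters those out is below. The name strtok_split is made up for this example, and it assumes you want all tokens at once as a list rather than strtok()'s one-token-per-call interface:

```python
import re

def strtok_split(text, delims):
    """Split text on runs of any character in delims, dropping empty tokens."""
    if not delims:
        # No delimiter characters: the whole string is a single token.
        return [text] if text else []
    # re.escape keeps characters such as ']' or '-' literal inside the class.
    pattern = "[" + re.escape(delims) + "]+"
    # Leading or trailing delimiters make re.split emit '', so filter those out.
    return [tok for tok in re.split(pattern, text) if tok]

for s in ("123,456", "234 567", "345, 678", ", 456,"):
    print(repr(s), "->", strtok_split(s, ", "))
```

With delims set to ", ", the first three inputs each give two tokens, and the last one gives just ['456'].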