python regex

Match a line with multiple regex using Python


Is there a way to check whether a line contains words that match any of a set of regex patterns? If I have [regex1, regex2, regex3] and I want to see if a line matches any of them, how would I do this? Right now I am using re.findall(regex1, line), but it only tests one regex at a time.


Solution

  • You can use the built-in function any (or all, if every regex has to match) with a generator expression to loop over all the regex objects.

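    any(regex.match(line) for regex in [regex1, regex2, regex3])

    (or any(re.match(regex_str, line) for regex_str in [regex_str1, regex_str2, regex_str3]) if the regexes are plain strings rather than pre-compiled regex objects). Note that match only tests the start of the line; use search instead to find a pattern anywhere in it.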

    However, that is usually slower than combining your regexes into a single expression, because the line is scanned once per pattern. If this code is time- or CPU-critical, try composing a single regular expression that covers all your needs, using the | (alternation) operator to separate the original expressions.

    A simple way to combine all the regexes is to use the string join method:

    re.match("|".join([regex_str1, regex_str2, regex_str3]), line)

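    A warning about combining regexes this way: alternation has the lowest precedence, so an original that already uses | at its top level still combines correctly, but the result can be wrong or invalid if the originals use numbered backreferences (the group numbers shift), repeat the same named group, or set inline flags such as (?i).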
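
The two approaches above can be put side by side in a minimal sketch. The pattern strings and function names here are hypothetical, not from the question, and search is used rather than match on the assumption that a pattern may appear anywhere in the line. The combined version also assumes the patterns contain no backreferences, named groups, or inline flags.

```python
import re

# Hypothetical example patterns; substitute your own.
PATTERN_STRINGS = [r"\berror\b", r"\bwarn(?:ing)?\b", r"timeout|timed out"]

# Approach 1: pre-compile each pattern and test them one at a time.
COMPILED = [re.compile(p) for p in PATTERN_STRINGS]


def matches_any(line):
    """Return True if at least one pattern is found anywhere in the line."""
    # search() scans the whole line; match() would only test its start.
    return any(rx.search(line) for rx in COMPILED)


# Approach 2: join the pattern strings with | into a single expression
# and compile it once, so each line is scanned only once.
COMBINED = re.compile("|".join(PATTERN_STRINGS))


def matches_combined(line):
    """Return True if the combined pattern is found anywhere in the line."""
    return COMBINED.search(line) is not None
```

Both functions should agree on every input as long as the assumptions above hold; if all patterns must be present, all can replace any in the first function, but the combined pattern has no equally simple equivalent.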