Saturday, August 24, 2019

Match a line with multiple regex using Python



Is there a way to see if a line contains words that match a set of regex patterns?
If I have [regex1, regex2, regex3], and I want to see if a line matches any of those, how would I do this?
Right now, I am using re.findall(regex1, line), but it only matches 1 regex at a time.


Answer



You can use the built-in functions any (or all, if all regexes have to match) and a generator expression to cycle through all the regex objects.




any(regex.match(line) for regex in [regex1, regex2, regex3])



(or any(re.match(regex_str, line) for regex_str in [regex_str1, regex_str2, regex_str3]) if the regexes are not pre-compiled regex objects, of course)
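As a minimal sketch of the any/all approach, the snippet below wraps both checks in small helper functions. The pattern list and the helper names (matches_any, matches_all) are hypothetical and only serve the illustration. It also swaps in search for match, on the assumption that the goal is to find a pattern anywhere in the line: match only succeeds at the start of the string.

```python
import re

# Hypothetical patterns, for illustration only.
patterns = [
    re.compile(r"\berror\b"),
    re.compile(r"\bwarn(ing)?\b"),
    re.compile(r"\d{3}-\d{4}"),
]


def matches_any(line, regexes):
    """Return True if at least one compiled regex is found anywhere in line."""
    # search() scans the whole line; match() would only test the beginning.
    return any(regex.search(line) for regex in regexes)


def matches_all(line, regexes):
    """Return True only if every compiled regex is found somewhere in line."""
    return all(regex.search(line) for regex in regexes)
```

Because any and all consume the generator lazily, any stops at the first regex that matches and all stops at the first one that does not.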



However, that will be inefficient compared to combining your regexes into a single expression. If this code is time- or CPU-critical, you should instead try composing a single regular expression that covers all your needs, using the special | regex operator to separate the original expressions.
A simple way to combine all the regexes is to use the string join method:



re.match("|".join([regex_str1, regex_str2, regex_str3]), line)
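The combined form might look like the sketch below. The pattern strings and the line_matches helper are hypothetical. Wrapping each pattern in a non-capturing group, (?:...), is a defensive habit added here rather than part of the original answer, and search is again used on the assumption that the pattern may appear anywhere in the line.

```python
import re

# Hypothetical pattern strings, for illustration only.
pattern_strings = [r"\berror\b", r"\bwarn(?:ing)?\b", r"\d{3}-\d{4}"]

# Wrap each pattern in a non-capturing group, join them with "|",
# and compile the result once so it can be reused for every line.
combined = re.compile("|".join(f"(?:{p})" for p in pattern_strings))


def line_matches(line):
    """Return True if any of the original patterns occurs in line."""
    return combined.search(line) is not None
```

Compiling once outside the loop over lines avoids rebuilding the joined string for every call.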




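Be careful, though: combining the regexes this way can change their meaning in some cases. The | operator has the lowest precedence, so patterns that already contain | still combine correctly, but a numbered backreference such as \1 will point at a different group once the patterns are joined, and a global inline flag such as (?i) affects the whole combined expression (recent Python versions reject it unless it appears at the very start).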
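One concrete way a joined expression can go wrong is group numbering. The sketch below uses two made-up patterns to show a numbered backreference changing meaning after the join, and named groups as one possible way around it. All variable names are hypothetical.

```python
import re

first = r"(a)x"     # contains one capturing group
second = r"(b)\1"   # on its own, \1 refers to (b), so this matches "bb"

standalone_match = re.search(second, "bb")

# After joining, (b) becomes group 2 and \1 refers to (a),
# which never takes part in the second alternative.
joined = "|".join([first, second])
joined_match = re.search(joined, "bb")

# Named groups keep their meaning regardless of position in the joined pattern.
named_second = r"(?P<ch>b)(?P=ch)"
joined_named = "|".join([first, named_second])
named_match = re.search(joined_named, "bb")
```

Here standalone_match and named_match find "bb", while joined_match is None because the renumbered backreference can no longer succeed.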

