How To Combine Multiple Regex Into Single One In Python?
Solution 1:
You need to join all your regex patterns with | and compile the result as a single pattern. Check this example:
import re
re1 = r'\d+\.\d*[L][-]\d*\s[A-Z]*[/]\d*'
re2 = r'\d*[/]\d*[A-Z]*\d*\s[A-Z]*\d*[A-Z]*'
re3 = r'[A-Z]*\d+[/]\d+[A-Z]\d+'
re4 = r'\d+[/]\d+[A-Z]*\d+\s\d+[A-Z]\s[A-Z]*'
# Compile the combined pattern once, outside the loop.
generic_re = re.compile("(%s|%s|%s|%s)" % (re1, re2, re3, re4))
# string1..string4 are placeholders for your input strings.
sentences = [string1, string2, string3, string4]
for sentence in sentences:
    matches = generic_re.findall(sentence)
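Here is a minimal, self-contained sketch of the same idea. The patterns and sample sentences are made up for illustration (they are not the ones from the question), and each alternative is wrapped in a non-capturing group so the | cannot bleed across pattern boundaries:

```python
import re

# Hypothetical patterns, chosen only to keep the example readable.
patterns = [r'\d{4}-\d{2}-\d{2}', r'[A-Z]{2}\d{3}', r'\d+/\d+']

# Wrap each alternative in (?:...) and join with |, then compile once.
combined = re.compile('|'.join('(?:%s)' % p for p in patterns))

# Hypothetical input strings.
sentences = ['due 2021-03-04, ref AB123', 'ratio 3/4', 'nothing here']

# With no capturing groups, findall returns the full text of each match.
results = [combined.findall(s) for s in sentences]
print(results)
```

Because the groups are non-capturing, findall gives back plain strings rather than tuples of groups.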
Solution 2:
To findall with an arbitrary series of REs, all you have to do is concatenate the lists of matches that each one returns:
re_list = [
    r'\d+\.\d*[L][-]\d*\s[A-Z]*[/]\d*',  # re1 in question
    ...
    r'\d+[/]\d+[A-Z]*\d+\s\d+[A-Z]\s[A-Z]*',  # re4 in question
]
matches = []
for r in re_list:
    matches += re.findall(r, string)
For efficiency it would be better to use a list of compiled REs.
Alternatively you could join the individual RE strings with | into a single pattern:
generic_re = re.compile( '|'.join( re_list) )
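The two approaches do not return matches in the same order, which may matter depending on what you do with them. A small sketch, assuming two toy patterns and a toy input string (both hypothetical):

```python
import re

re_list = [r'\d+', r'[a-z]+']   # hypothetical patterns
text = 'ab 12 cd 34'            # hypothetical input

# Approach 1: a list of compiled REs, results concatenated per pattern.
compiled = [re.compile(r) for r in re_list]
per_pattern = []
for c in compiled:
    per_pattern += c.findall(text)

# Approach 2: one joined pattern, results in order of appearance.
joined = re.compile('|'.join(re_list)).findall(text)

print(per_pattern)
print(joined)
```

The per-pattern loop groups results by pattern, while the joined pattern reports them in the order they occur in the string.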
Solution 3:
I see lots of people are using pipes, but with alternation each match comes from only one of the alternatives. If you want all of the sub-patterns to match in the same string, then try using lookaheads.
Example:
>>> fruit_string = "10a11p"
>>> fruit_regex = r'(?=.*?(?P<pears>\d+)p)(?=.*?(?P<apples>\d+)a)'
>>> re.match(fruit_regex, fruit_string).groupdict()
{'pears': '11', 'apples': '10'}
>>> re.match(fruit_regex, fruit_string).group(0)
''
>>> re.match(fruit_regex, fruit_string).group(1)
'11'
(?=...) is a lookahead:
Matches if ... matches next, but doesn’t consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) will match 'Isaac ' only if it’s followed by 'Asimov'.
.*?(?P<pears>\d+)p finds a number followed by a p anywhere in the string and names the number "pears".
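To make the difference from pipes concrete, here is a rough side-by-side sketch on the same fruit string. The alternation pattern is my own rewrite for comparison, not something from the question:

```python
import re

fruit_string = "10a11p"

# Alternation: each match comes from exactly one alternative,
# so the other named group is None in that match.
alt = re.compile(r'(?P<pears>\d+)p|(?P<apples>\d+)a')
alt_matches = [m.groupdict() for m in alt.finditer(fruit_string)]

# Lookaheads: a single zero-width match fills both named groups.
look = re.compile(r'(?=.*?(?P<pears>\d+)p)(?=.*?(?P<apples>\d+)a)')
m = look.match(fruit_string)

print(alt_matches)
print(m.groupdict(), repr(m.group(0)))
```

Note that the lookahead match itself is empty, because lookaheads consume nothing; the useful data is all in the groups.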
Solution 4:
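If you want pattern A immediately followed by pattern B, you might not need to compile both regex patterns; you can simply concatenate the pattern strings. Here is a way, let's see if it works for you.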
>>> import re
>>> text = 'aaabaaaabbb'
>>> A = 'aaa'
>>> B = 'bbb'
>>> re.findall(A+B, text)
['aaabbb']
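Keep in mind that A+B means "A followed by B", which is not the same as "A or B". A quick sketch of the difference, reusing the same text, A, and B (the variable names sequence and either are mine):

```python
import re

text = 'aaabaaaabbb'
A = 'aaa'
B = 'bbb'

# Concatenation: A must be immediately followed by B.
sequence = re.findall(A + B, text)

# Alternation: either A or B may match at each position.
either = re.findall(A + '|' + B, text)

print(sequence)
print(either)
```

If A and B are literal strings that might contain regex metacharacters, passing them through re.escape first is probably a good idea.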
Further reading: read_doc