Stepping through regex pattern matches in Python
In Python:
- Assignment is not allowed in conditionals.
- The state of a regex match is determined from the returned match object, which also contains the other match information.
Now I want to match one particular pattern out of 10 or 15, and I end up with something cluttered like this:
    m = pat1.match(buffer)
    if m:
        tok = tok1
        val = m.group(0)
    else:
        m = pat2.match(buffer)
        if m:
            tok = tok2
            val = m.group(0)
        else:
            m = pat3.match(buffer)
            if m:
                tok = tok3
                val = m.group(0)
                # extra processing here and there, which makes looping unsuitable
            else:
                m = pat4.match(buffer)
                if m:
                    tok = tok4
                    val = m.group(0)
                else:
                    # ... keep indenting
What I would like to have is something like the following:
    if match ... pat1:
        tok =
        val =
    elif match ... pat2:
        tok =
        val =
    elif match ... pat3:
        tok =
        val =
    ...
(As can be done in other languages, possibly using features such as assignment in conditionals, a side effect on a standard match object, or a different form of the match function that takes pass-by-reference arguments ...)
We could maybe use a loop to run through the patterns, but that wouldn't be suitable if there are variations in the processing for each match.
So: is there a nice Pythonic way to keep the match conditionals at the same level?
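The restriction on assignment in conditionals described Python at the time this was asked. Since Python 3.8, assignment expressions (the := operator) make a flat if/elif chain possible, with room for different processing in each branch. A minimal sketch, assuming Python 3.8 or later; the patterns NUMBER, NAME and SPACE and the function classify are hypothetical names chosen for illustration, not part of the original code:

```python
import re

# Hypothetical patterns, chosen only for illustration.
NUMBER = re.compile(r"\d+")
NAME = re.compile(r"[A-Za-z_]\w*")
SPACE = re.compile(r"\s+")


def classify(buffer):
    """Return (tok, val) for the first pattern matching at the start of buffer."""
    if m := NUMBER.match(buffer):
        # Each branch can do its own processing on the match object.
        tok, val = "NUMBER", int(m.group(0))
    elif m := NAME.match(buffer):
        tok, val = "NAME", m.group(0)
    elif m := SPACE.match(buffer):
        tok, val = "SPACE", None
    else:
        tok, val = None, None
    return tok, val
```

Every branch stays at the same indentation level, and the match object is available inside the branch that matched. On older interpreters this syntax is a SyntaxError, so the table-driven loop remains the portable option.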
You can loop over the patterns and tokens in pairs; adjust the following as needed:
    for pat, token in zip([pat1, pat2, pat3], ['tok1', 'tok2', 'tok3']):
        m = pat.match(buffer)
        if m:
            val = m.group(0)
            tok = token
            break  # the first matching pattern wins
The idea is to build a table beforehand that maps patterns to values:
    import re

    tests = [
        (re.compile('([a-z]{2})'), 'func1'),
        (re.compile('(a{5})'), 'token2'),
    ]
    for pattern, token in tests:
        m = pattern.match(buffer)
        if m:
            # whatever
            pass
This can be further extended to provide a callable instead, which takes the match object and the buffer as arguments, does whatever it wants, and returns a value.
E.g.:
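    import re

    def func1(match, buf):
        print('entered function')
        # Convert only the matched digits; int(buf) fails if text follows them.
        return int(match.group(0)) * 50

    tests = [
        (re.compile(r'\d+'), func1),
    ]
    for pattern, func in tests:
        m = pattern.match(buffer)
        if m:
            result = func(m, buffer)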
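The callable table above can be wrapped in a single function so that the first matching pattern decides which handler runs. This is only a sketch of that idea: the handlers handle_number and handle_word, the TESTS table and the dispatch function are hypothetical names, not from the original post.

```python
import re


# Hypothetical handlers: each receives the match object and the full buffer.
def handle_number(match, buf):
    return int(match.group(0)) * 50


def handle_word(match, buf):
    return match.group(0).upper()


# Order matters: patterns are tried top to bottom and the first match wins.
TESTS = [
    (re.compile(r"\d+"), handle_number),
    (re.compile(r"[a-z]+"), handle_word),
]


def dispatch(buffer):
    """Call the handler of the first pattern that matches; None if none match."""
    for pattern, func in TESTS:
        m = pattern.match(buffer)
        if m:
            return func(m, buffer)
    return None
```

For example, dispatch("12 apples") runs handle_number on the match "12", while dispatch("apples") falls through to handle_word. Because each handler is an ordinary function, the per-pattern variations in processing that made a plain loop unsuitable live inside the handlers, and the loop itself stays flat.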