Issue
Note: I'm using the PyPI regex module.
I have the following regex pattern (flags V1 + VERBOSE):
(?(DEFINE)
(?P<id>[\d-]+)
)
id:\s(?&id)(,\s(?&id))*
How can I retrieve every piece of text that the <id> group matched?
For example, in the following text:
don't match this date: 2020-10-22 but match this id: 5668-235 as well as these id: 7788-58-2, 8688-25, 74-44558
I should be able to retrieve the following values:
["5668-235", "7788-58-2", "8688-25", "74-44558"]
Note that this regex matches the pattern, but I would like to retrieve every occurrence matched by a specific group (even if it matched multiple times within the same match object).
Solution
The named capturing groups defined inside a DEFINE block serve only as building blocks for the rest of the pattern; they do not actually capture the text they match when they are called from the consuming part of the pattern.
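To observe this behaviour yourself, here is a minimal sketch that runs the original pattern from the question and prints, for each match, the full matched text together with whatever captures("id") reports. It assumes the third-party regex module is installed (pip install regex); the variable names are only illustrative.

```python
import regex

# Original pattern from the question: <id> is only defined inside DEFINE and
# is reached through (?&id) calls, never as a capturing group of its own.
pattern = r'''(?(DEFINE)
(?P<id>[\d-]+)
)
id:\s(?&id)(,\s(?&id))*'''

text = ("don't match this date: 2020-10-22 but match this id: 5668-235 "
        "as well as these id: 7788-58-2, 8688-25, 74-44558")

matches = list(regex.finditer(pattern, text, regex.VERBOSE))
for m in matches:
    # group(0) is the whole match. Per the explanation above, text matched
    # through (?&id) calls is not expected to show up in captures("id").
    print(repr(m.group(0)), m.captures("id"))
```

The pattern clearly finds both "id:" runs, yet the individual ids are not retrievable from the <id> group, which is why an extra capturing group is needed.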
In this particular case, you can use
(?(DEFINE)
(?P<id>[\d-]+)
)
id:\s+(?P<idm>(?&id))(?:,\s+(?P<idm>(?&id)))*
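See this regex demo. The point is to wrap each (?&id) call in an additional named capturing group; I named it idm, but you may use any name for it. The match object's captures() method then returns every value that group captured within a single match.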
See the Python demo:
import regex
pat = r'''(?(DEFINE)
(?P<id>[\d-]+)
)
id:\s+(?P<idm>(?&id))(?:,\s+(?P<idm>(?&id)))*'''
text = r"don't match this date: 2020-10-22 but match this id: 5668-235 as well as these id: 7788-58-2, 8688-25, 74-44558"
print( [x.captures("idm") for x in regex.finditer(pat, text, regex.VERBOSE)] )
# => [['5668-235'], ['7788-58-2', '8688-25', '74-44558']]
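The demo above yields one list per match, whereas the question asks for a single flat list. Here is a small sketch that flattens the per-match captures. It assumes the third-party regex module is installed; the helper name find_all_ids is hypothetical and not part of any library.

```python
import regex

# Same pattern as in the answer, compiled once for reuse.
PATTERN = regex.compile(r'''(?(DEFINE)
(?P<id>[\d-]+)
)
id:\s+(?P<idm>(?&id))(?:,\s+(?P<idm>(?&id)))*''', regex.VERBOSE)


def find_all_ids(text):
    """Return every idm capture from every match as one flat list."""
    return [value
            for match in PATTERN.finditer(text)
            for value in match.captures("idm")]


text = ("don't match this date: 2020-10-22 but match this id: 5668-235 "
        "as well as these id: 7788-58-2, 8688-25, 74-44558")
ids = find_all_ids(text)
print(ids)
# => ['5668-235', '7788-58-2', '8688-25', '74-44558']
```

Keeping the per-match lists (as in the original demo) is still useful when you need to know which ids appeared together after the same "id:" prefix.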
Answered By - Wiktor Stribiżew