Issue
I'm trying to write a regex with duplicated group names. In the example below, I want to find the values of ph, A and B such that substituting them into pattern yields string. I do this with the regex library, since Python's default re library does not allow duplicate group names.
import regex

pattern = r'(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ \( (?P<ph>.*?) \-> (?P<A>.*?) = (?P<B>.*?) \) \)'
string = 'y = N __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'
match = regex.fullmatch(pattern, string)
for k, v in match.groupdict().items():
    print(f'{k}: {v}')
And I retrieve the expected output:
ph: y = N
A: ( A ` y )
B: ( A ` N )
My concern is that there seems to be an issue with this library, or I'm not using it properly. For instance, if I replace string with:
string = 'BLABLA __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'
then the code above produces exactly the same values for ph, A and B, ignoring the BLABLA prefix at the beginning of string, whereas match should be None because there is no solution.
Any ideas?
Note: more precisely, in my problem I have pairs of patterns/strings (p_0, s_0) ... (p_n, s_n) and I have to find a valid match across all of these pairs, so I concatenated them with a __ delimiter. I am also curious whether there is a proper way to do this.
Solution
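Repeating a group name in the regex library does not force the groups to capture the same text; each occurrence matches independently and the name reports the last capture, which is why the BLABLA prefix was silently accepted. Since you want the first three groups to be equal to the corresponding next three, use backreferences to the first three groups rather than repeating the identically named capturing groups: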
^(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ \( (?P=ph) \-> (?P=A) = (?P=B) \) \)$
Here, (?P=ph), (?P=A) and (?P=B) are named backreferences: each matches exactly the text that was captured by the group with the corresponding name.
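As a minimal sketch of the pattern above applied to both strings from the question: named backreferences are also supported by the standard re module, so this example uses re rather than regex (an assumption made here to keep it dependency-free; regex.fullmatch accepts the same pattern). The helper name solve is hypothetical and only used for illustration.

```python
import re

# Capture ph, A and B once, then require the same text again via backreferences.
PATTERN = re.compile(
    r"(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ "
    r"\( (?P=ph) \-> (?P=A) = (?P=B) \) \)"
)


def solve(string):
    """Return the captured values, or None when no consistent assignment exists."""
    match = PATTERN.fullmatch(string)
    return match.groupdict() if match else None


good = "y = N __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )"
bad = "BLABLA __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )"

print(solve(good))  # {'ph': 'y = N', 'A': '( A ` y )', 'B': '( A ` N )'}
print(solve(bad))   # None
```

The second string is rejected because (?P=ph) must repeat BLABLA inside the parentheses, and the text found there is y = N.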
The ^ and $ anchors are not necessary in your code since you use the regex.fullmatch method, but you need them when you test your pattern online in a regex tester.
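The effect of the anchors can be checked with a short sketch, again using the standard re module as an assumption. The variable names body, anchored and noisy are hypothetical. It compares fullmatch on the unanchored pattern with search on the anchored one, and shows what an unanchored search does when extra text surrounds the input.

```python
import re

body = (
    r"(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ "
    r"\( (?P=ph) \-> (?P=A) = (?P=B) \) \)"
)
anchored = "^" + body + "$"

string = "y = N __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )"
noisy = "junk " + string + " junk"

# fullmatch anchors implicitly; search needs explicit ^ and $ to behave alike here.
full = re.fullmatch(body, string)
searched = re.search(anchored, string)
print(full.groupdict() == searched.groupdict())

# Without anchors, search finds the pattern inside surrounding text.
unanchored = re.search(body, noisy)
print(unanchored.group(0) == string)

# With anchors, the surrounding text makes the match fail.
print(re.search(anchored, noisy) is None)
```

One caveat: in Python, $ also matches just before a trailing newline, so an anchored search is slightly more permissive than fullmatch for inputs that end with a newline.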
Answered By - Wiktor Stribiżew