Issue
I am doing a practice question on a Regex course:
How would you write a regex that matches a sentence where the first word is either Alice, Bob, or Carol; the second word is either eats, pets, or throws; the third word is apples, cats, or baseballs; and the sentence ends with a period? This regex should be case-insensitive. It must match the following:
- Alice eats apples.
- Bob pets cats.
- Carol throws baseballs.
- Alice throws Apples.
- BOB EATS CATS.
My code is as follows:
regex=re.compile(r'Alice|Bob|Carol\seats|pets|throws\sapples\.|cats\.|baseballs\.',re.IGNORECASE)
mo=regex.search(str)
ma=mo.group()
When I pass str = 'BOB EATS CATS.' or 'Alice throws Apples.', mo.group() only returns 'Bob' or 'Alice' respectively, but I was expecting it to return the whole sentence.
When I pass str = 'Carol throws baseballs.', mo.group() returns 'baseballs.', which is the last match.
I am confused as to why:
- For the first two str examples I passed, it returned the first match ('Bob' or 'Alice'), whilst the third str example returned the last match ('baseballs.').
- In all three str examples, mo.group() is not returning the entire sentence as the match, i.e. I was expecting 'Carol throws baseballs.' as output from mo.group().
Solution
You need to tell your regex to group the lists of options somehow, or it will naturally think it's one giant list, with some elements containing spaces. The easiest way is to use capture groups for each word:
regex=re.compile(r'(Alice|Bob|Carol)\s+(eats|pets|throws)\s+(apples|cats|baseballs)\.', re.IGNORECASE)
The trailing period shouldn't be part of an option. If you don't want to use capturing groups for some reason (it won't really affect how the match is made), you can use non-capturing groups instead: replace (...) with (?:...).
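As a quick check, here is a minimal sketch that runs the non-capturing variant of the pattern against the five sentences from the question. The names pattern, sentences and results are only illustrative.

```python
import re

# Non-capturing variant of the grouped pattern; it matches the same text as the (...) version.
pattern = re.compile(
    r'(?:Alice|Bob|Carol)\s+(?:eats|pets|throws)\s+(?:apples|cats|baseballs)\.',
    re.IGNORECASE,
)

# The five sentences the exercise says must match.
sentences = [
    'Alice eats apples.',
    'Bob pets cats.',
    'Carol throws baseballs.',
    'Alice throws Apples.',
    'BOB EATS CATS.',
]

# Each alternation is now confined to its own group, so group() is the whole sentence.
results = [pattern.search(s).group() for s in sentences]
for matched in results:
    print(matched)
```

With the capturing version you could additionally read the individual words with mo.group(1), mo.group(2) and mo.group(3).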
Your original regex was interpreted as the following set of options:
Alice
Bob
Carol\seats
pets
throws\sapples\.
cats\.
baseballs\.
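Spaces don't magically separate options. Hopefully you can see why no part of 'Carol throws baseballs.' besides 'baseballs.' matches anything in that list. Something like 'Carol eats baseballs.' would match 'Carol eats' though. Also note that search() returns the leftmost match in the string, which is why the first two examples return the first word while this one returns the last.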
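To see this behaviour directly, here is a small sketch that runs the original, ungrouped pattern from the question. The helper first_match is a hypothetical name used only for this illustration.

```python
import re

# The original pattern from the question: one flat list of seven alternatives.
original = re.compile(
    r'Alice|Bob|Carol\seats|pets|throws\sapples\.|cats\.|baseballs\.',
    re.IGNORECASE,
)

def first_match(text):
    """Return the text of the leftmost match, or None if nothing matches."""
    mo = original.search(text)
    return mo.group() if mo else None

print(first_match('Alice throws Apples.'))     # Alice
print(first_match('Carol throws baseballs.'))  # baseballs.
print(first_match('Carol eats baseballs.'))    # Carol eats
```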
Answered By - Mad Physicist