Issue
I want to return all the words which start and end with letters or numbers. They may contain at most one period (.) or hyphen (-) inside the word. So, ab.ab is valid but ab. is not valid.
import re
reg = r"[\d\w]+([-.][\d\w]+)?"
s = "sample text"
print(re.findall(reg, s))
It is not working because of the parentheses. How can I apply the ? to the combination [-.][\d\w]+ as a whole?
Solution
If ab. is not valid and should not be matched, and the period or hyphen must not appear at the start or at the end, you could match one or more letters or digits, followed by an optional part that matches a dot or a hyphen and then one or more letters or digits. Making that optional part a non-capturing group, (?:...), means re.findall returns the whole match rather than only the contents of the group.
(?<!\S)[a-zA-Z\d]+(?:[.-][a-zA-Z\d]+)?(?!\S)
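As a rough illustration, the pattern can be applied with re.findall as sketched below. The sample text is a made-up input, not taken from the question, chosen to include both valid and invalid tokens.

```python
import re

# Pattern from the answer: whitespace-delimited tokens made of letters/digits
# with at most one inner dot or hyphen.
pattern = re.compile(r"(?<!\S)[a-zA-Z\d]+(?:[.-][a-zA-Z\d]+)?(?!\S)")

# Hypothetical sample text (an assumption for this example).
text = "ab.ab ab. x-1 a.b.c -ab abc 42"

matches = pattern.findall(text)
print(matches)  # ['ab.ab', 'x-1', 'abc', '42']
```

Tokens such as ab., a.b.c and -ab are skipped entirely, because the lookarounds require whitespace (or the string boundary) on both sides of the match.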
Explanation
(?<!\S)
Negative lookbehind asserting that what is on the left is not a non-whitespace character
[a-zA-Z\d]+
Match one or more lowercase/uppercase characters or digits
(?:[.-][a-zA-Z\d]+)?
An optional non-capturing group that matches a dot or a hyphen followed by one or more lowercase/uppercase characters or digits
(?!\S)
Negative lookahead asserting that what is on the right is not a non-whitespace character
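To see why the parentheses in the question's pattern caused trouble, here is a small sketch comparing a capturing and a non-capturing group under re.findall. When a pattern contains exactly one capturing group, re.findall returns what that group captured instead of the full match. The sample text is a hypothetical input. Note also that [\d\w] is equivalent to \w, which additionally matches the underscore.

```python
import re

text = "ab.ab x-1 abc"  # hypothetical sample input

# Capturing group: findall returns only the group's contents
# (an empty string when the optional group did not take part in the match).
capturing = re.findall(r"[\d\w]+([-.][\d\w]+)?", text)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r"[\d\w]+(?:[-.][\d\w]+)?", text)

print(capturing)      # ['.ab', '-1', '']
print(non_capturing)  # ['ab.ab', 'x-1', 'abc']
```

Without the lookarounds, this looser pattern would still pick up the ab part of ab., which is why the answer adds the whitespace boundaries.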
Answered By - The fourth bird