Issue
The script I am using calls the s_lower method to transform all text to lowercase, but there is a catch: if the text contains a link (matched by a special regex), the link is not lowercased. I would like to apply the same or similar logic with another regex.
RE_WEBURL_NC = (
r"(?:(?:(?:(?:https?):)\/\/)(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1["
r"6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?"
r":[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z0-9][a-z0-9_-]{0,62})?[a-z0-9]\.)+(?:[a-z]{2,}\.?))(?::\d{2,5})?)(?:"
r"(?:[/?#](?:(?![\s\"<>{}|\\^~\[\]`])(?!<|>|\"|').)*))?"
)
def s_lower(value):
    url_nc = re.compile(f"({RE_WEBURL_NC})")
    # Do not lowercase links
    if url_nc.search(value):
        substrings = url_nc.split(value)
        for idx, substr in enumerate(substrings):
            if not url_nc.match(substr):
                substrings[idx] = i18n_lower(substr)
        return "".join(substrings)
    return i18n_lower(value)
I want to lowercase all text other than text inside the special tags.
def s_lower(value):
    spec_nc = re.compile(r"\[spec .*\]")  # this is for [spec some raNdoM cAsE text here]
    if spec_nc.search(value):
        substrings = spec_nc.split(value)
        for idx, substr in enumerate(substrings):
            if not spec_nc.match(substr):
                substrings[idx] = i18n_lower(substr)
        return "".join(substrings)
    return i18n_lower(value)
Solution
Was writing this as a comment, but it got too long...
You haven't actually said what your problem is, but it looks like you're missing the () around the regex. Without a capturing group, split discards the text it matched, so the tag never ends up in substrings. It should be
spec_nc = re.compile(r"(\[spec .*\])")
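To see why the parentheses matter, here is a small sketch of how re.split behaves with and without a capturing group. The sample string is made up for illustration.

```python
import re

text = "Hello [spec KeepMe] World"

# Without a capturing group the matched tag is dropped from the result.
without_group = re.split(r"\[spec .*\]", text)

# With a capturing group the matched tag is kept as its own list element.
with_group = re.split(r"(\[spec .*\])", text)

print(without_group)  # ['Hello ', ' World']
print(with_group)     # ['Hello ', '[spec KeepMe]', ' World']
```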
Note:
- you should use [^]]* instead of .* to ensure your match stays within a single set of [].
- you don't really need to search; if the string is not present then split will simply return the original string in a single-element list, which you can still iterate.
- you don't need the call to match; the strings which match the split regex will always be at the odd indexes of the list, so you can just lowercase dependent on idx.
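The three notes above can be checked with a short sketch. The input strings are invented for illustration; only the two regexes come from the discussion.

```python
import re

text = "AAA [spec One] BBB [spec Two] CCC"

# Greedy .* runs to the LAST "]", swallowing the text between the two tags.
greedy = re.split(r"(\[spec .*\])", text)

# [^]]* stops at the first "]", so each tag is matched separately.
bounded = re.split(r"(\[spec [^]]*\])", text)

# With no tag present, split returns the whole string as a single element.
no_tag = re.split(r"(\[spec [^]]*\])", "No Tags Here")

print(greedy)   # ['AAA ', '[spec One] BBB [spec Two]', ' CCC']
print(bounded)  # ['AAA ', '[spec One]', ' BBB ', '[spec Two]', ' CCC']
print(no_tag)   # ['No Tags Here']
```

In the bounded result the tags sit at indexes 1 and 3, which is what makes the idx check sufficient.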
So you can simplify your code to:
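import re

def s_lower(value):
    # i18n_lower is the lowercasing helper already defined in your script
    spec_nc = re.compile(r"(\[spec [^]]*\])")  # this is for [spec some raNdoM cAsE text here]
    substrings = spec_nc.split(value)
    for idx, substr in enumerate(substrings):
        # even indexes hold the text between tags; odd indexes hold the tags themselves
        if idx % 2 == 0:
            substrings[idx] = i18n_lower(substr)
    return "".join(substrings)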
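For a quick check outside the original script, something like the following should work. Note that i18n_lower here is a hypothetical stand-in that just calls str.lower (the real helper lives in the asker's script and may do more), and compiling the pattern once at module level as SPEC_NC is my own choice, not part of the original code.

```python
import re


def i18n_lower(text):
    # Hypothetical stand-in for the script's own i18n_lower helper.
    return text.lower()


# Compiled once at module level (an assumption; the original compiles per call).
SPEC_NC = re.compile(r"(\[spec [^]]*\])")


def s_lower(value):
    substrings = SPEC_NC.split(value)
    for idx, substr in enumerate(substrings):
        # Even indexes are the text outside the tags; odd indexes are the tags.
        if idx % 2 == 0:
            substrings[idx] = i18n_lower(substr)
    return "".join(substrings)


print(s_lower("Some TEXT [spec some raNdoM cAsE text here] And MORE"))
# some text [spec some raNdoM cAsE text here] and more
```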
Answered By - Nick