Issue
I'm really struggling with this one. I tried to search from left to right, but still can't figure this out.
I have a list of strings with a random number of tags, each enclosed in brackets and randomly positioned within each string. A few examples may look as follows.
[tag1][tag4] Desired string - with optional dash [tag10]
[tag1][tag2][tag3] Desired string [tag10]
[tag3][tag1][tag2][tag5] Desired - string (with suffix)
[tag2][tag5][tag4] [Animation] Target string [tag10]
[tag3][tag1][tag5][tag10][Animations](prefix)Desired - string (and suffix)
What I'm trying to achieve is to extract from each string the content without the tags, which are enclosed in brackets. The only exception is the tag [Animation] or [Animations]. If one of these tags appears, I want to extract it as well, together with the desired string.
So for the list above, the desired output would be the following. (I don't care about the whitespace around the extracted strings; it will be trimmed afterwards.)
Desired string - with optional dash
Desired string
Desired - string (with suffix)
[Animation] Target string
[Animations](prefix)Desired - string (and suffix)
Originally, I was using a regex as simple as \[.*?\], which matched all tags in brackets, and I simply replaced every match with an empty string.
import re
re_pattern = r"\[.*?\]"
re.sub(re_pattern, '', dirty_string).strip()
However, I now need an exception for the tags [Animation] and [Animations], and I really can't figure it out. Your help would be much appreciated. Thanks.
Solution
Use a negative lookahead: match an opening bracket only if it is not immediately followed by Animation] or Animations], so those two tags are never removed:
import re
lines = '''\
[tag1][tag4] Desired string - with optional dash [tag10]
[tag1][tag2][tag3] Desired string [tag10]
[tag3][tag1][tag2][tag5] Desired - string (with suffix)
[tag2][tag5][tag4] [Animation] Target string [tag10]
[tag3][tag1][tag5][tag10][Animations](prefix)Desired - string (and suffix)'''.splitlines()
for line in lines:
    print(re.sub(r'\[(?!Animations?\]).*?\]', '', line).strip())
Output:
Desired string - with optional dash
Desired string
Desired - string (with suffix)
[Animation] Target string
[Animations](prefix)Desired - string (and suffix)
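If the set of protected tags may grow, the same negative-lookahead idea can be wrapped in a small helper. The following is only a sketch: the function name strip_tags and its keep parameter are hypothetical and not part of the original answer, and it assumes keep contains at least one tag name. The pattern is written with re.VERBOSE so each piece can be commented.

```python
import re

def strip_tags(text, keep=("Animation", "Animations")):
    """Remove every [tag] from text except the tags named in keep.

    Hypothetical helper for illustration; assumes keep is non-empty.
    """
    # Build an alternation of the protected tag names, escaped so that
    # any regex metacharacters in a name are matched literally.
    protected = "|".join(re.escape(name) for name in keep)
    pattern = re.compile(
        rf"""
        \[                       # an opening bracket ...
        (?!(?:{protected})\])    # ... not followed by a protected name plus a closing bracket
        .*?                      # the tag body, as short as possible
        \]                       # the closing bracket
        """,
        re.VERBOSE,
    )
    return pattern.sub("", text).strip()

print(strip_tags("[tag2][tag5][tag4] [Animation] Target string [tag10]"))
print(strip_tags("[x][Draft] text [Animation]", keep=("Draft",)))
```

With the default keep, the helper should behave like the one-line pattern above; passing a different tuple changes which tags survive.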
Answered By - Mark Tolonen