Issue
I am trying to use the re module in Python to split a string that represents a list. Each item is preceded by a number in square brackets.
Input:
"[1]first[2]second[3]third" ... etc
Desired output:
['first', 'second', 'third',...]
My current code is as follows:
out = re.split(r'\[(.*?)\]', thelist)
It returns the following instead. How do I get the desired output?
['', '1', 'first', '2', 'second', '3', 'third',...]
Solution
You can split on a regex that matches the numbers enclosed in [...]
and then filter out the empty elements:
import re
p = re.compile(r'\[\d+\]')
test_str = "[1]first[2]second[3]third"
print([x for x in p.split(test_str) if x])
# => ['first', 'second', 'third']
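To keep the numbers in the output, in Python 3 (3.7 or later, where re.split can split on empty matches) you can split at the zero-width position before each [number]: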
import re
test_str = "[1]first[2]second[3]third"
print(re.split(r'(?!^)(?=\[\d+])', test_str))
# => ['[1]first', '[2]second', '[3]third']
Your code returned the captured text because re.split
includes the text of every capturing group in the pattern as separate elements of the resulting list.
If there are capturing groups in the separator and it matches at the start of the string, the result will start with an empty string.
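To make the effect of the capturing group concrete, here is a minimal sketch that runs the original pattern next to a variant using a non-capturing group. The variable names are only illustrative and are not part of the original answer.

```python
import re

text = "[1]first[2]second[3]third"

# Capturing group: the text matched by (.*?) is inserted into the result.
with_group = re.split(r'\[(.*?)\]', text)

# Non-capturing group: the separator is discarded entirely.
without_group = re.split(r'\[(?:.*?)\]', text)

print(with_group)     # ['', '1', 'first', '2', 'second', '3', 'third']
print(without_group)  # ['', 'first', 'second', 'third']
```

In both cases the leading empty string remains, because the separator matches at the very start of the string.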
Also, to get rid of just the first empty element, you may use
res = p.split(test_str)
if not res[0]:
    del res[0]
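The two clean-up approaches differ when an item in the middle is empty. The sketch below uses a made-up input, "[1]first[2][3]third", to show that filtering drops every empty string while deleting res[0] removes only the leading one. The extra `res and` guard is an added precaution, not part of the original answer.

```python
import re

p = re.compile(r'\[\d+\]')
sample = "[1]first[2][3]third"  # hypothetical input with an empty second item

# Filtering removes every empty element, including the one in the middle.
filtered = [x for x in p.split(sample) if x]

# Deleting res[0] removes only the leading empty element.
res = p.split(sample)
if res and not res[0]:
    del res[0]

print(filtered)  # ['first', 'third']
print(res)       # ['first', '', 'third']
```

Which one is appropriate depends on whether empty items are meaningful in your data.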
Answered By - Wiktor Stribiżew