Issue
I have this df:
id  description
0   changed status to **In progress** of this task
1   changed status to **Closed** of this task
2   changed status to **Testing** of this task
3   changed status to **Update** of this task
4   changed status to **Completed** of this task
I want to subset this df by extracting the substrings from the description column that sit between the ** markers, such as In progress, Closed, Testing, Update and Completed.
I tried a regex, following some Stack Overflow solutions, such as:
import re
re.search('**(.+?)**', df.description)
But I got this error:
error: nothing to repeat at position 0
Any idea how to solve this?
Solution
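Pandas has regex support built in, so there is no need to call re.search on the column. The original pattern failed because * is a regex quantifier with nothing before it to repeat; a literal asterisk has to be escaped as \*: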
>>> df['description'].str.extract(r'\*\*(.*)\*\*')
             0
0  In progress
1       Closed
2      Testing
3       Update
4    Completed
Just use the str.extract method.
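For anyone who wants to run this end to end, here is a minimal sketch. It rebuilds the DataFrame from the sample rows in the question, reproduces the original error, and then stores the extracted text in a new column. The column name status, the non-greedy (.+?) group, and the final filter on "Closed" are my own assumptions for illustration, not part of the original answer.

```python
import re

import pandas as pd

# Sample data rebuilt from the rows shown in the question.
df = pd.DataFrame({
    "description": [
        "changed status to **In progress** of this task",
        "changed status to **Closed** of this task",
        "changed status to **Testing** of this task",
        "changed status to **Update** of this task",
        "changed status to **Completed** of this task",
    ]
})

# The unescaped pattern does not even compile: '*' is a quantifier
# and there is nothing in front of it to repeat.
try:
    re.compile("**(.+?)**")
    compile_error = None
except re.error as exc:
    compile_error = str(exc)

# Escape each asterisk and use a raw string so the backslashes reach
# the regex engine unchanged. The non-greedy group stops at the first
# closing '**' in case a row contains more than one marked phrase.
pattern = r"\*\*(.+?)\*\*"

# expand=False returns a Series instead of a one-column DataFrame.
# "status" is a hypothetical column name chosen for this example.
df["status"] = df["description"].str.extract(pattern, expand=False)

# One possible way to subset on the extracted value.
closed = df[df["status"] == "Closed"]

print(df["status"].tolist())
# ['In progress', 'Closed', 'Testing', 'Update', 'Completed']
```

Rows that contain no ** pair come back as NaN from str.extract, so a dropna() on the new column may be useful before filtering.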
Answered By - U12-Forward