Issue
I have a dataframe like this.
I want to remove the parentheses in the "Movie Name" column.
Here are a few examples of the parenthesized values:
df_Movie["Movie Name"].str.findall(r'\([^()]*\)').sum()
['(500)',
'(Eksiteu)',
'(Mumbai Diaries)',
'(Geukhanjikeob)',
'(Erkekler Ne İster?)',
'(The Witch)',
'(Ji Hun)']
Then I tried this solution:
import re
df_Movie["Movie Name"] = df_Movie["Movie Name"].str.replace(r'\([^()]*\)', ' ', regex=True)
Here is the output of the solution for one example.
df_Movie.iloc[394, :]
Movie Name Days of Summer
Year 2009
IMDB 7.7
Name: 394, dtype: object
In this case, the values between the parentheses are lost completely. I don't want that.
I want output like this: (500) Days of Summer --> 500 Days of Summer
How can I solve this?
Solution
You can remove all parenthesis characters from a dataframe column, keeping the text between them, using:
df_Movie["Movie Name"] = df_Movie["Movie Name"].str.replace(r'[()]+', '', regex=True)
The [()]+ regex pattern matches one or more ( or ) chars.
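To see why the two patterns behave differently, here is a minimal sketch using plain re.sub on a few strings. The second and third titles are made up for illustration; only "(500) Days of Summer" comes from the question. Series.str.replace with regex=True applies the same kind of substitution to each element of the column.

```python
import re

# Illustrative titles; only the first one appears in the question.
titles = ["(500) Days of Summer", "Example Title (The Witch)", "No Parentheses"]

# Pattern from the question: matches the parentheses AND the text inside them.
drop_group = re.compile(r"\([^()]*\)")

# Pattern from the answer: matches only the parenthesis characters themselves.
drop_parens = re.compile(r"[()]+")

# Replacing the whole group with a space throws away the inner text.
lost = [drop_group.sub(" ", t) for t in titles]

# Replacing only the parenthesis characters keeps the inner text.
kept = [drop_parens.sub("", t) for t in titles]

for before, after in zip(titles, kept):
    print(f"{before!r} --> {after!r}")
```

Note that [()]+ also removes unbalanced parentheses, such as a stray "(" with no closing ")". That is usually fine for cleaning up titles, but it is worth knowing if the column can contain them.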
Answered By - Wiktor Stribiżew