Issue
I am preprocessing one column of a data set, 'Title'. I have already removed numbers and punctuation, but I also want to remove measurements. The measurements are not in a separate column; they are part of the Title column.
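# Load the data set
import pandas as pd

df = pd.read_csv(r'example')
# Remove punctuation, then numbers.
# regex=True is needed in pandas 2.0+, where str.replace treats the pattern literally by default.
df['Title'] = df['Title'].str.replace(r'[^\w\s]+', '', regex=True)
df['Title'] = df['Title'].str.replace(r'\d+', '', regex=True)
print(df['Title'])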
Solution
df['Title'] = df['Title'].str.replace(r'\sg$|\skg$|\sml$', '', regex=True)
This is just an example covering g, kg and ml. More generally, removing the last word whenever it consists only of lowercase letters amounts to:
df['Title'] = df['Title'].str.replace(r'\s[a-z]+$', '', regex=True)
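Putting the steps together, here is a minimal sketch of the whole clean-up, assuming the units always sit at the end of the title. The sample titles, the `UNIT_PATTERN` unit list and the `clean_title` helper are all made up for illustration; they are not from the original data set. Removing the digits first can leave doubled or trailing spaces, so the sketch collapses whitespace before matching the trailing unit.

```python
import pandas as pd

# Hypothetical sample data; the real data set is not shown in the question.
df = pd.DataFrame({
    "Title": ["Olive Oil 500 ml", "Basmati Rice 2kg", "Sea Salt, 250 g", "Green Tea"]
})

# Illustrative unit list (an assumption): extend it with whatever units occur in your data.
UNIT_PATTERN = r"\s(?:kg|g|ml|l)$"


def clean_title(titles: pd.Series) -> pd.Series:
    """Strip punctuation, digits and a trailing unit word from each title."""
    cleaned = titles.str.replace(r"[^\w\s]+", "", regex=True)   # punctuation
    cleaned = cleaned.str.replace(r"\d+", "", regex=True)       # digits
    # Collapse the gaps left behind by the removed digits.
    cleaned = cleaned.str.replace(r"\s+", " ", regex=True).str.strip()
    # Drop a unit only when it is the last word of the title.
    cleaned = cleaned.str.replace(UNIT_PATTERN, "", regex=True)
    return cleaned.str.strip()


df["Title"] = clean_title(df["Title"])
print(df["Title"].tolist())
```

An explicit unit list is more conservative than `\s[a-z]+$`, which would also strip a final lowercase word that is not a measurement.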
Answered By - Yusuf Ertas