Issue
I have an Excel sheet and I want to extract values from different columns into a single column.
First of all, I want to figure out how to deal with subheaders like "astro" and "athens grey", and then how to extract the information in this pattern. Thanks.
I have managed to resolve the subheader issue. Now I just want help with a regex to extract the information in the desired format. Here is what I have done so far: Subheaders
Solution
See if it helps:
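import numpy as np
import pandas as pd

data = pd.read_excel('Sample.xlsx')
# Inspect rows with exactly 6 missing values (presumably the subheader rows)
data[data.isna().sum(axis=1) == 6]
# Drop rows that are completely empty
data = data.dropna(how='all')
# Take the SKU text before any parenthesis, blank out values containing digits,
# forward-fill the remaining subheader names, then append DESCRIPTION and the
# first run of SIZE characters that are neither digits nor 'x'
(data['SKU'].astype(str).str.extract(r'([^()]*)')[0].str.strip()
 .replace(r'\d+', np.nan, regex=True).ffill()
 + ' ' + data['DESCRIPTION'] + ' ' + data['SIZE'].str.extract(r'([^0-9x]+)').fillna('')[0])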
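Since Sample.xlsx is not available here, the same idea can be tried on a small made-up frame. This is only a sketch: the rows, the values and the helper name build_labels are hypothetical, and the column names SKU, DESCRIPTION and SIZE are taken from the code above. It assumes that subheader rows carry a name without digits in SKU and that item rows always have a digit in SKU.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Sample.xlsx; the real sheet layout is an assumption.
data = pd.DataFrame({
    "SKU": ["Astro (new)", "1001", "1002", "Athens Grey", "2001"],
    "DESCRIPTION": [np.nan, "Wall tile", "Floor tile", np.nan, "Floor tile"],
    "SIZE": [np.nan, "300x600 Matt", "600x600 Gloss", np.nan, "600x600 Matt"],
})

def build_labels(df):
    """Combine the forward-filled subheader, DESCRIPTION and SIZE suffix."""
    sku = df["SKU"].astype(str)
    # Keep only the text before any parenthesis, e.g. "Astro (new)" -> "Astro"
    name = sku.str.extract(r"([^()]*)")[0].str.strip()
    # Values containing a digit are treated as item SKUs, not subheaders;
    # blank them out and carry the last subheader name downwards
    header = name.mask(name.str.contains(r"\d")).ffill()
    # First run of characters in SIZE that are neither digits nor 'x'
    suffix = df["SIZE"].str.extract(r"([^0-9x]+)")[0].fillna("").str.strip()
    label = header + " " + df["DESCRIPTION"] + " " + suffix
    # Subheader rows have no DESCRIPTION, so their label is NaN; drop them
    return label.dropna()

print(build_labels(data).tolist())
```

One thing to watch: the pattern [^0-9x]+ stops at the first digit or the letter x, so a SIZE suffix that itself contains an x (for example "box") would be cut short.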
Answered By - keramat