Issue
I have data that looks something like this:
order_id Product_A Product_B Price_Product_A Price_Product_B
100 Pen Notebook 1.5 3
101 Bag Watch 10 12
I need it to look like this:
order_id product price
100 Pen 1.5
100 Notebook 3
101 Bag 10
101 Watch 12
How can I use stack() and unstack() for this? I used it so fat only for one numeric measure.
Solution
There is a convenient function for this, wide_to_long
:
pd.wide_to_long(df, ['Product','Price_Product'], i='order_id', j='subtype', sep = '_', suffix = '\D+')
output:
Product Price_Product
order_id subtype
100 A Pen 1.5
101 A Bag 10.0
100 B Notebook 3.0
101 B Watch 12.0
The same can be achieved with melt
and unstack
, which could be of instructional value. A somewhat tricky bit is to split 'variable'
into two parts, the root and the suffix, something wide_to_long
helps you with. For your example case this can look like this:
df1 = df.melt(id_vars = 'order_id')
df1['cat'] = df1['variable'].str[:-2] # you may have to tweak this for your actual data
df1['subtype'] = df1['variable'].str[-1:] # you may have to tweak this for your actual data
(df1.drop(columns = 'variable')
.set_index(['order_id','subtype','cat'])
.unstack()
.droplevel(level=0, axis=1)
.reset_index()
)
Answered By - piterbarg
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.