Issue
I'd like to melt a DataFrame without using any loops. Suppose I have a DataFrame looking something like this:
import pandas as pd

df = pd.DataFrame({'var1': [1,2,3,4,5]*100,
                   'var2': [1,2,3,4,5]*100,
                   'col1': ['a','b']*250,
                   'col2': ['c','d']*250})
var1 var2 col1 col2
0 1 1 a c
1 2 2 b d
2 3 3 a c
3 4 4 b d
4 5 5 a c
.. ... ... ... ...
495 1 1 b d
496 2 2 a c
497 3 3 b d
498 4 4 a c
499 5 5 b d
And now I want to melt the data:
df.melt(value_vars=['var1', 'var2'], var_name='var', id_vars=['col1', 'col2'])
col1 col2 var value
0 a c var1 1
1 b d var1 2
2 a c var1 3
3 b d var1 4
4 a c var1 5
.. ... ... ... ...
995 b d var2 1
996 a c var2 2
997 b d var2 3
998 a c var2 4
999 b d var2 5
Is it possible, without using any loops, to also melt the id_vars into a single column? The result would look something like this:
col var value
0 a var1 1
1 b var1 2
2 a var1 3
3 b var1 4
4 a var1 5
5 c var2 1
6 d var2 2
7 c var2 3
8 d var2 4
9 c var2 5
.. .. .... ..
Solution
Use wide_to_long. It creates a new column from the number that follows the var and col stubs in the column names, so if necessary, prepend the string 'var' to it:
df1 = (pd.wide_to_long(df.reset_index(), stubnames=['var','col'], i='index', j='new')
.reset_index(level=1)
.assign(new = lambda x: 'var' + x['new'].astype(str))
.reset_index(drop=True)
)
print (df1)
new var col
0 var1 1 a
1 var1 2 b
2 var1 3 a
3 var1 4 b
4 var1 5 a
.. ... ... ..
995 var2 1 d
996 var2 2 c
997 var2 3 d
998 var2 4 c
999 var2 5 d
[1000 rows x 3 columns]
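To see what wide_to_long does with the column names, here is a minimal sketch on a tiny frame. The frame `small` and the variable names are made up for illustration; only the column naming pattern (var1, var2, col1, col2) comes from the question.

```python
import pandas as pd

# Hypothetical two-row frame with the same column naming pattern as the question.
small = pd.DataFrame({'var1': [1, 2], 'var2': [3, 4],
                      'col1': ['a', 'b'], 'col2': ['c', 'd']})

# wide_to_long needs a column that identifies each row, so the index is
# turned into a regular column named 'index' first.
long = pd.wide_to_long(small.reset_index(), stubnames=['var', 'col'],
                       i='index', j='new')

# The result is indexed by (index, new); the 'new' level holds the numeric
# suffixes that were stripped from the original column names.
suffixes = sorted(long.index.get_level_values('new').unique())
print(long)
print(suffixes)
```

Each stub ('var', 'col') becomes one column, and the suffix is stored as an integer, which is why the solution above converts it to a string before prepending 'var'.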
Another possible, if somewhat hacky, solution uses melt:
df1 = df.melt(value_vars=['var1', 'var2'], var_name='var', id_vars=['col1', 'col2'])
df2 = df.melt(value_vars=['col1', 'col2'], var_name='var', id_vars=['var1', 'var2'])
df = pd.concat([df1[['var','value']], df2['value'].rename('col')], axis=1)
print (df)
var value col
0 var1 1 a
1 var1 2 b
2 var1 3 a
3 var1 4 b
4 var1 5 a
.. ... ... ..
995 var2 1 d
996 var2 2 c
997 var2 3 d
998 var2 4 c
999 var2 5 d
[1000 rows x 3 columns]
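The melt approach appears to work because both melt calls stack their value columns in the same order and return a fresh 0..n-1 index, so the two results line up row by row when concatenated along axis=1. A minimal sketch of that alignment, using a made-up two-row frame (`small`, `values`, `labels` and `out` are illustrative names, not part of the original answer):

```python
import pandas as pd

# Hypothetical two-row frame with the same column naming pattern as the question.
small = pd.DataFrame({'var1': [1, 2], 'var2': [3, 4],
                      'col1': ['a', 'b'], 'col2': ['c', 'd']})

# Melt the numeric columns and the label columns separately.
values = small.melt(value_vars=['var1', 'var2'], var_name='var')
labels = small.melt(value_vars=['col1', 'col2'], var_name='src')

# Both results carry a default RangeIndex, so concat pairs row k of
# `values` with row k of `labels`: var1 with col1, then var2 with col2.
out = pd.concat([values, labels['value'].rename('col')], axis=1)
print(out)
```

This pairing only holds while both value_vars lists are given in matching order (var1 with col1, var2 with col2); reordering one list would silently pair the wrong columns.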
Answered By - jezrael