Issue
Is there an easy way to sort a DataFrame based on a linear combination of two columns without creating a new column for that value? Given
df = pd.DataFrame([[4,1],[2,3]], columns=list('AB'))
  | A | B |
---|---|---|
0 | 4 | 1 |
1 | 2 | 3 |
I would like to sort df by a given combination of columns A and B (e.g. the product A*B). Calling sort_values with a key function does not work, because it applies the function to each column individually.
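A minimal sketch of that behaviour, assuming a reasonably recent pandas (the key argument of sort_values was added in 1.1). The record helper and the calls list are hypothetical names used only for illustration; the key simply logs what it receives and returns it unchanged.

```python
import pandas as pd

df = pd.DataFrame([[4, 1], [2, 3]], columns=list("AB"))

calls = []  # one entry per invocation of the key function


def record(col):
    # The key receives a single Series, never both columns at once.
    calls.append(type(col).__name__)
    return col  # identity key, so the sort itself is unaffected


df.sort_values(by=["A", "B"], key=record)

# One call per column listed in `by`, each with a single Series argument.
print(calls)
```

Because the key only ever sees one column at a time, a two-argument lambda such as lambda a, b: a*b has nothing to bind its second parameter to.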
Ideally, I would do something like:
df.sort_values(by=['A','B'], key=lambda a,b: a*b) # does not work
Right now I am creating an extra column sort like this, and I am wondering whether that is necessary.
df['sort'] = df['A']*df['B']
df.sort_values(['sort'])
Thanks in advance.
Solution
Use DataFrame.sort_index with the product Series, passing its .get method as the key:
df1 = df.sort_index(key=(df.A*df.B).get)
Or use Series.argsort with DataFrame.iloc:
df1 = df.iloc[(df.A*df.B).argsort()]
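To check that both variants agree with the helper-column approach from the question, here is a small sketch. It assumes a three-row frame (one row more than the question's df, so that the sort actually reorders something); the names product, by_index, by_position and reference are illustrative only.

```python
import pandas as pd

# Hypothetical data: the question's two rows plus one extra row.
df = pd.DataFrame([[4, 1], [2, 3], [1, 2]], columns=list("AB"))

product = df["A"] * df["B"]  # 4, 6, 2 for index labels 0, 1, 2

# Variant 1: sort_index passes the index to the key; product.get looks up
# the product for each label, so rows are ordered by those values.
by_index = df.sort_index(key=product.get)

# Variant 2: argsort returns the positions that would sort the products,
# and iloc reorders the rows by position.
by_position = df.iloc[product.argsort()]

# Reference: the helper-column approach, with the column dropped again.
reference = df.assign(sort=product).sort_values("sort").drop(columns="sort")

print(by_index)
print(by_index.equals(by_position), by_index.equals(reference))
```

Neither variant adds a column to df. One difference worth keeping in mind: the .get variant looks values up by index label, so it presumably needs unique labels to return one value per row, whereas the argsort variant works purely by position.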
Answered By - jezrael