Issue
I have this DataFrame:
import pandas as pd

df = pd.DataFrame({'CLASS': ['A', 'B', 'A'],
                   'MEMBERS': ['foo & bar', 'bar & luz', 'baz']})
print(df)
#   CLASS    MEMBERS
# 0     A  foo & bar
# 1     B  bar & luz
# 2     A        baz
First, I want to group on the column CLASS and combine the unique values of the column MEMBERS. Second, I need the unique combinations to be in a specific order: ['foo', 'bar', 'baz', 'luz'].
I was able to do the first part:
df.groupby('CLASS')['MEMBERS'].agg(lambda s: " & ".join(set(' & '.join(s).split(' & '))))
# CLASS
# A    foo & baz & bar
# B          luz & bar
# Name: MEMBERS, dtype: object
How can I achieve the ordering?
My expected output is:
# CLASS
# A    foo & bar & baz
# B          bar & luz
# Name: MEMBERS, dtype: object
Solution
You can use sorted with a key taken from a dictionary that maps each member to its position in the desired order:
order = ['foo', 'bar', 'baz', 'luz']
mapper = {k: i for i, k in enumerate(order)}
# {'foo': 0, 'bar': 1, 'baz': 2, 'luz': 3}

out = (df.groupby('CLASS')['MEMBERS']
         .agg(lambda s: " & ".join(sorted(set(' & '.join(s).split(' & ')),
                                          key=mapper.get)))
      )
Output:
CLASS
A    foo & bar & baz
B          bar & luz
Name: MEMBERS, dtype: object
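To see why the key does the job, here is the sorting step in isolation on a plain Python set, without pandas. This is only an illustrative sketch; the names members and ranked are made up for the example:

```python
order = ['foo', 'bar', 'baz', 'luz']
mapper = {k: i for i, k in enumerate(order)}

# Members of class A after splitting and de-duplicating
# (a set has no guaranteed iteration order)
members = {'baz', 'foo', 'bar'}

# sorted() calls mapper.get on each element and sorts by the returned rank
ranked = sorted(members, key=mapper.get)
print(ranked)  # ['foo', 'bar', 'baz']
```

Note that mapper.get returns None for a member that is not listed in order, and in Python 3 sorted raises a TypeError when it has to compare None with an integer rank.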
Alternative with a function and itertools.chain:
from itertools import chain

def cust_join(s, order):
    # Map each member to its rank in the desired order
    mapper = {k: i for i, k in enumerate(order)}
    # Split every string, flatten, de-duplicate, then sort by rank
    return ' & '.join(sorted(set(chain.from_iterable(x.split(' & ') for x in s)),
                             key=mapper.get
                             ))

out = (df.groupby('CLASS')['MEMBERS']
         .agg(cust_join, order=['foo', 'bar', 'baz', 'luz'])
      )
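If the data may contain members that are missing from the order list, a variant with a fallback rank could look like the sketch below. The function name ordered_join, the sep parameter, and the choice to put unknown members last in alphabetical order are assumptions made for this example, not part of the answer above:

```python
from itertools import chain

def ordered_join(values, order, sep=' & '):
    """Join the unique members found in `values`, ranked by `order`.

    Members missing from `order` are placed last, alphabetically
    (an assumption for this sketch).
    """
    rank = {k: i for i, k in enumerate(order)}
    unique = set(chain.from_iterable(v.split(sep) for v in values))
    # Tuple key (rank, name): unknown names get rank len(order), so they sort last
    return sep.join(sorted(unique, key=lambda m: (rank.get(m, len(order)), m)))

order = ['foo', 'bar', 'baz', 'luz']
result_a = ordered_join(['foo & bar', 'baz'], order)
result_x = ordered_join(['qux & bar', 'luz & abc'], order)
print(result_a)  # foo & bar & baz
print(result_x)  # bar & luz & abc & qux
```

Because the function only iterates over its first argument, it should be usable in place of cust_join, e.g. df.groupby('CLASS')['MEMBERS'].agg(ordered_join, order=order).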
Answered By - mozway