Issue
I want to assign a number to each group. I tried to do
df['group_n'] = df.groupby('ID').ngroup()
but it gives me a warning:
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
If I do df['group_n'] = df.groupby('ID').ngroup().add(1)
I get group_n in descending order (meaning C:3, B:2, A:1). Is there a way to keep the current row order but have group_n count up from the first group (C:1, B:2, A:3)?
My current table:
ID date sender
C Jan20 3
C Feb20 7
C Mar20 12
C Apr20 15
B Mar20 1
B May20 10
B Jun20 15
...
A Jan21 10
A Feb21 12
A Mar21 20
A Apr21 5
Desired table:
ID date sender group_n
C Jan20 3 1
C Feb20 7 1
C Mar20 12 1
C Apr20 15 1
B Mar20 1 2
B May20 10 2
B Jun20 15 2
A Jan21 10 3
A Feb21 12 3
A Mar21 20 3
A Apr21 5 3
Thank you in advance!
Solution
Use:
df['group_n'] = pd.factorize(df['ID'])[0] + 1
Or:
df['group_n'] = df.groupby('ID', sort=False).ngroup().add(1)
print(df)
ID date sender group_n
C Jan20 3 1
C Feb20 7 1
C Mar20 12 1
C Apr20 15 1
B Mar20 1 2
B May20 10 2
B Jun20 15 2
A Jan21 10 3
A Feb21 12 3
A Mar21 20 3
A Apr21 5 3
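For anyone who wants to reproduce this, here is a minimal sketch built from the rows shown in the question (the elided rows are left out). The names raw, sorted_n and group_n_factorize, and the sender > 0 filter, are made up for illustration. The filter step is only an assumption about where the SettingWithCopyWarning came from: that warning usually means df was itself sliced from another DataFrame, and taking an explicit .copy() before adding a column is one common way to avoid it.

```python
import pandas as pd

# Sample data mirroring the question's table (elided rows omitted).
raw = pd.DataFrame({
    "ID": ["C"] * 4 + ["B"] * 3 + ["A"] * 4,
    "date": ["Jan20", "Feb20", "Mar20", "Apr20", "Mar20", "May20", "Jun20",
             "Jan21", "Feb21", "Mar21", "Apr21"],
    "sender": [3, 7, 12, 15, 1, 10, 15, 10, 12, 20, 5],
})

# Assumption: df was produced by filtering another frame (hypothetical filter).
# An explicit copy makes df independent, so assigning a new column is safe.
df = raw[raw["sender"] > 0].copy()

# Default sort=True numbers groups by sorted key, so A comes first and C last.
sorted_n = df.groupby("ID").ngroup().add(1)

# sort=False numbers groups in order of first appearance, so C comes first.
df["group_n"] = df.groupby("ID", sort=False).ngroup().add(1)

# pd.factorize also assigns codes in order of first appearance.
codes, uniques = pd.factorize(df["ID"])
df["group_n_factorize"] = codes + 1

# One row per ID, to compare the two numbering columns.
print(df.drop_duplicates("ID")[["ID", "group_n", "group_n_factorize"]])
```

Both approaches should give the same numbering here. factorize avoids building a groupby object, while ngroup may be the more natural choice if you are already grouping for other reasons.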
Answered By - ansev