Issue
I have two columns in my dataframe that I want to group by and assign ids to.
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 3, 4],
                   'B': [1, 2, 3, 4, 5, 4]})
A B
1 1
2 2
3 3
4 4
3 5
4 4
grouped = df.groupby(['A','B'])
returns
A  B
1  1
2  2
3  3
   5
4  4
I am trying to assign a unique id to each grouping.
import uuid

def idx(x):
    return str(uuid.uuid4())
grouped.agg(lambda x: idx(x))
which returns a pandas Series:
A  B
1  1    ab6ac10e-7dbc-43a4-9f93-cc0c83ec2d03
2  2    c26548ec-9002-4ad5-bad9-c84f8c594c9b
3  3    8daab68b-51aa-42b3-8546-3b64ee73f460
   5    cb8f7da1-81de-4bed-8ae9-790c64ac66e2
4  4    b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe
dtype: object
What I am trying to do is write these unique ids back into the original dataframe, so that every row carries the id of its (A, B) group. I expect something like this:
A B idx
1 1 ab6ac10e-7dbc-43a4-9f93-cc0c83ec2d03
2 2 c26548ec-9002-4ad5-bad9-c84f8c594c9b
3 3 8daab68b-51aa-42b3-8546-3b64ee73f460
4 4 b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe
3 5 cb8f7da1-81de-4bed-8ae9-790c64ac66e2
4 4 b742a9e0-ba08-42f2-b9e8-13cf6c3b0dbe
Solution
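Reindex the aggregated result with a MultiIndex built from the rows of df, so that each row looks up the id of its own (A, B) group: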
df['new'] = grouped.agg(lambda x: idx(x)).reindex(pd.MultiIndex.from_frame(df)).values
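The reindex idea can be sketched end to end as follows. This is an illustrative sketch rather than the answerer's code: the names keys, unique_pairs and ids_by_group are hypothetical, and the per-group Series is built directly from the distinct (A, B) pairs instead of through agg, an assumption made so that the sketch runs on the two-column example frame.

```python
import uuid

import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 3, 4],
                   "B": [1, 2, 3, 4, 5, 4]})

keys = ["A", "B"]  # hypothetical name for the grouping columns

# One row per distinct (A, B) pair, in order of first appearance.
unique_pairs = df[keys].drop_duplicates()

# One random UUID per pair, indexed by the pair itself.
ids_by_group = pd.Series(
    [str(uuid.uuid4()) for _ in range(len(unique_pairs))],
    index=pd.MultiIndex.from_frame(unique_pairs),
)

# Look up each original row's (A, B) pair in that Series. Taking .values
# drops the MultiIndex, so the assignment back into df is positional.
df["idx"] = ids_by_group.reindex(pd.MultiIndex.from_frame(df[keys])).values
```

Rows that share an (A, B) pair, such as the two (4, 4) rows, end up with the same id, while distinct pairs get distinct ids. The ids themselves are random, so they differ on every run.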
Answered By - BENY