Issue
I have a list of users and their activities, like this:
user | date | activity |
---|---|---|
Tom | 1/1/21 | Hop |
Dick | 1/2/21 | Skip |
Harry | 2/2/21 | Jump |
Tom | 1/3/21 | Skip |
Dick | 1/4/21 | Jump |
I want to extract unique user names and the earliest activity date, to get a result like this:
user | first activity |
---|---|
Tom | 1/1/21 |
Dick | 1/2/21 |
Harry | 2/2/21 |
I know I can create an array of unique usernames like this:
unique_users = user_actions["user"].unique()
But I don't know how to turn that array of unique usernames into a dataframe with the first action date.
Solution
To get the desired result, you can do the following:
- Convert the date column to datetime format for easy sorting and filtering
- Group by user and find the minimum date for each user
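import pandas as pd

# df is the DataFrame from the question (called user_actions there)
# Parse the date strings so they compare chronologically rather than as text
df['date'] = pd.to_datetime(df['date'])
# Earliest date per user, exposed as a first_activity column, in date order
result = (
    df.groupby("user")["date"]
    .agg(first_activity="min")
    .reset_index()
    .sort_values("first_activity")
)
print(result)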
Output:
    user first_activity
2    Tom     2021-01-01
0   Dick     2021-01-02
1  Harry     2021-02-02
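For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby-and-minimum approach. It rebuilds the sample table from the question and assumes the dates are month/day/two-digit-year, which is what the output above implies. The explicit format string and the variable name first_activity are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question's table.
user_actions = pd.DataFrame(
    {
        "user": ["Tom", "Dick", "Harry", "Tom", "Dick"],
        "date": ["1/1/21", "1/2/21", "2/2/21", "1/3/21", "1/4/21"],
        "activity": ["Hop", "Skip", "Jump", "Skip", "Jump"],
    }
)

# Assumption: month/day/two-digit-year. Passing the format explicitly
# avoids relying on pandas to infer it from ambiguous strings like 1/2/21.
user_actions["date"] = pd.to_datetime(user_actions["date"], format="%m/%d/%y")

# Earliest date per user, then sorted by that date with a fresh 0..n-1 index.
first_activity = (
    user_actions.groupby("user")["date"]
    .min()
    .reset_index(name="first_activity")
    .sort_values("first_activity")
    .reset_index(drop=True)
)
print(first_activity)
```

The final reset_index(drop=True) only renumbers the rows after sorting. Drop it if you would rather keep the index produced by the groupby, as in the output shown above.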
Answered By - Kayvan Shah