Issue
I have a DataFrame (df) in a Jupyter notebook containing my company's customers. They are answering a survey that can be answered multiple times, and the date and time of each submission is recorded. I would like to select the latest answer from each customer and collect these in a new DataFrame.
I tried to use:
df_1 = df[df['Submit Date'] == df['Submit Date'].max()].copy()
but .max() returns only the single latest date in the whole column, so df_1 ended up with just one row. I am new to this area, so sorry if there are some beginner-level mistakes.
Solution
Sort the rows in ascending order by submission date, then drop the duplicates by customer. When a customer appears more than once, keep='last' keeps the final row for that customer, which after sorting is their latest answer. The code looks as follows:
df_1 = df.sort_values('Submit Date').drop_duplicates(subset=['customer'], keep='last')
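To see the approach end to end, here is a minimal sketch on made-up data. The sample rows and the 'answer' column are assumptions for illustration only; 'customer' and 'Submit Date' follow the snippets above and may be named differently in your DataFrame.

```python
import pandas as pd

# Hypothetical sample data: customers A and B each answered twice.
df = pd.DataFrame({
    "customer": ["A", "B", "A", "C", "B"],
    "Submit Date": [
        "2021-01-05 10:00",
        "2021-01-06 09:30",
        "2021-02-01 14:15",
        "2021-01-20 08:00",
        "2021-01-02 17:45",
    ],
    "answer": [3, 5, 4, 2, 1],  # assumed column, just to tell rows apart
})

# Parse the timestamps so that sorting is chronological, not alphabetical.
df["Submit Date"] = pd.to_datetime(df["Submit Date"])

# Oldest first, then keep only the last (most recent) row per customer.
df_1 = (
    df.sort_values("Submit Date")
      .drop_duplicates(subset=["customer"], keep="last")
      .reset_index(drop=True)
)

print(df_1)
```

If 'Submit Date' is stored as text, converting it with pd.to_datetime first is the safer choice, since some date formats (day-first strings, for example) do not sort chronologically as plain strings.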
Answered By - Alexandru Placinta