Issue
My data looks something like this:
I am looking to get the unique values of col1 (which could have duplicates) and their corresponding max value in col3. I also need the col2 value of the row which has that max value.
I referred to this solution but it's not quite giving me what I am looking for. Any help on this is appreciated. Thanks!
Solution
This can be done by finding the max value of col3 for each col1 group, storing the result in a new dataframe, and then merging it back with the first dataframe to recover the matching col2 values.
import pandas as pd
# initialize list of lists
data = [['a1', 'b1', 5], ['a1', 'b2', 6], ['c1', 'd1', 3], ['c1', 'd2', 4], ['c1', 'd3', 1]]
# Create the pandas DataFrame
df1 = pd.DataFrame(data, columns=['col1', 'col2', 'col3'])
# Create dataframe holding the max col3 value for each col1 group
df2 = df1.groupby(['col1'])['col3'].max().reset_index()
# Merge on both col1 and col3 so only the rows holding each group's max are kept
result = df1.merge(df2, on=['col1', 'col3'])
print(result)
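A possible alternative to the merge approach is to look up the index of each group's max row with `idxmax` and select those rows directly. The sketch below reuses the sample data from above; the names `idx` and `result` are only illustrative. Unlike the merge, this returns exactly one row per col1 value even if the max is tied within a group, which may or may not be what you want.

```python
import pandas as pd

data = [['a1', 'b1', 5], ['a1', 'b2', 6], ['c1', 'd1', 3], ['c1', 'd2', 4], ['c1', 'd3', 1]]
df1 = pd.DataFrame(data, columns=['col1', 'col2', 'col3'])

# Index label of the row holding the max col3 within each col1 group
idx = df1.groupby('col1')['col3'].idxmax()

# Select those rows: one row per col1 value, with its col2 and max col3
result = df1.loc[idx].reset_index(drop=True)
print(result)
```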
Answered By - adwib