Issue
How to select the subset of rows where distance is lowest, grouping by date
and p
columns?
df
v p distance date
0 14.6 sst 22454.1 2021-12-30
1 14.9 sst 24454.1 2021-12-30
2 14.8 sst 33687.4 2021-12-30
3 1.67 wvht 23141.8 2021-12-30
4 1.9 wvht 24454.1 2021-12-30
5 1.8 wvht 24454.1 2021-12-30
6 1.7 wvht 23141.4 2021-12-31
7 2.1 wvht 24454.1 2021-12-31
Ideally, the returned dataframe should contain:
df
v p distance date
0 14.6 sst 22454.1 2021-12-30
3 1.67 wvht 23141.8 2021-12-30
6 1.7 wvht 23141.4 2021-12-31
Solution
One way is to use groupby
+ idxmin
to get the index of the smallest distance per group, then use loc
to get the desired output:
out = df.loc[df.groupby(['date', 'p'])['distance'].idxmin()]
Output:
v p distance date
0 14.60 sst 22454.1 2021-12-30
3 1.67 wvht 23141.8 2021-12-30
6 1.70 wvht 23141.4 2021-12-31
Answered By - enke
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.