Issue
Say I have a pandas data frame called df_all. I then wish to group by the column Foo and select only the last row of each group:
# E.g.
# Foo Bar Baz
# 1 1 1
# 1 2 2
# 2 1 2
# 2 3 4
# 2 5 6
# Wish to select rows '1 2 2' and '2 5 6' since, if we group by Foo,
# they are the last row for each distinct Foo value
df_slice = df_all.groupby('Foo').last()
The above works. Now I wish to get the set of rows that are in df_all but not in df_slice. This is what I tried:
df_inverse = df_all[~df_slice.isin(df_all)].dropna(how='all')
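A minimal sketch for reproducing the setup, assuming the example values listed in the comments above. It shows one likely source of trouble: groupby('Foo').last() moves Foo into the index, so df_slice no longer carries the original row labels of df_all, and a label-based comparison between the two frames has nothing reliable to align on.

```python
import pandas as pd

# Example data taken from the question's comments
df_all = pd.DataFrame({
    "Foo": [1, 1, 2, 2, 2],
    "Bar": [1, 2, 1, 3, 5],
    "Baz": [1, 2, 2, 4, 6],
})

df_slice = df_all.groupby("Foo").last()

# The index now holds the Foo values, not the original row labels 0..4
print(df_slice.index.tolist())
# Foo is no longer a regular column
print(df_slice.columns.tolist())
```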
Solution
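What about using duplicated() with keep='last'? It marks every row except the last one for each Foo value, which is exactly the set you want: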
df_inverse = df_all[df_all.duplicated(subset='Foo', keep='last')]
print(df_inverse)
   Foo  Bar  Baz
0    1    1    1
2    2    1    2
3    2    3    4
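As a self-contained sketch of the same idea, assuming the example data from the question, a single boolean mask can produce both the last row per Foo and its complement while keeping the original row labels. The names mask and df_last are illustrative and not part of the original answer.

```python
import pandas as pd

# Example data taken from the question
df_all = pd.DataFrame({
    "Foo": [1, 1, 2, 2, 2],
    "Bar": [1, 2, 1, 3, 5],
    "Baz": [1, 2, 2, 4, 6],
})

# True for every row that is followed by a later row with the same Foo value
mask = df_all.duplicated(subset="Foo", keep="last")

df_last = df_all[~mask]     # last row per Foo, original index preserved
df_inverse = df_all[mask]   # all remaining rows

print(df_last)
print(df_inverse)
```

Here df_last matches df_all.drop_duplicates(subset='Foo', keep='last'). Unlike groupby('Foo').last(), it always returns whole rows as they appear in df_all, whereas last() picks the last non-null value in each column separately.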
Answered By - user2246849