Issue
Currently I have a DataFrame that looks something like this:
Session Id | Time | Event | Data |
---|---|---|---|
1 | 10:00 | Btn click | foo |
1 | 11:00 | Identification | bar |
2 | ... | Btn click | foo |
2 | ... | Btn click | foo |
3 | .. | Identification | bar |
I want to group my data by Session Id and process the groups further, but only if they possess an Identification Event.
Currently my solution looks like this:
for session in df.groupby('Session Id'):
    session_df: pd.DataFrame = session[1]
    if 'Identification' in session_df['Event'].values:
        process(session_df)
I tried to use filter on the groupby, but it did not work:
for session in df.groupby('Session Id').filter(lambda s: 'Identification' in s['Event'].values):
    process(session[1])
Solution
Code
Build a boolean condition with groupby + transform:
cond = df['Event'].eq('Identification').groupby(df['Session Id']).transform('sum').gt(0)
out = df[cond]
out:
   Session Id   Time           Event  Data
0           1  10:00       Btn click   foo
1           1  11:00  Identification   bar
4           3     ..  Identification   bar
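To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. It uses transform('any') instead of summing and comparing to zero; that is a substitution on my part, not what the answer uses, and the variable names is_ident and has_ident are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Session Id': [1, 1, 2, 2, 3],
    'Event': ['Btn click', 'Identification', 'Btn click',
              'Btn click', 'Identification'],
})

# Step 1: row-wise flag, True where the event is an Identification.
is_ident = df['Event'].eq('Identification')

# Step 2: per session, check whether any row is flagged, and broadcast
# that single answer back onto every row of the session.
has_ident = is_ident.groupby(df['Session Id']).transform('any')

# Step 3: keep only the rows that belong to a qualifying session.
out = df[has_ident]
```

Because transform returns a Series aligned with the original index, has_ident can be used directly as a boolean mask on df.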
If you prefer groupby + filter, use the following code:
df.groupby('Session Id').filter(lambda x: x['Event'].eq('Identification').sum() > 0)
This gives the same result.
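The loop in the question fails because filter returns a plain DataFrame, not a collection of groups, so iterating over it yields column labels. One way to still process each session separately is to group the filtered frame again. The sketch below assumes that; process here is a hypothetical stand-in for the question's function and only records which session it received.

```python
import pandas as pd

processed = []  # collects the session ids that reach process()

def process(session_df: pd.DataFrame) -> None:
    # Hypothetical placeholder for the real per-session processing.
    processed.append(int(session_df['Session Id'].iloc[0]))

df = pd.DataFrame({
    'Session Id': [1, 1, 2, 2, 3],
    'Time': ['10:00', '11:00', '...', '...', '..'],
    'Event': ['Btn click', 'Identification', 'Btn click',
              'Btn click', 'Identification'],
    'Data': ['foo', 'bar', 'foo', 'foo', 'bar'],
})

# Keep only the rows of sessions that contain an Identification event.
filtered = df.groupby('Session Id').filter(
    lambda g: g['Event'].eq('Identification').any()
)

# Group the remaining rows again to get one DataFrame per session.
for session_id, session_df in filtered.groupby('Session Id'):
    process(session_df)
```

Grouping twice is not free on large frames; the explicit if check inside a single groupby loop, as in the question's first snippet, avoids the second pass.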
Example Code
import pandas as pd
data1 = {'Session Id': [1, 1, 2, 2, 3],
         'Time': ['10:00', '11:00', '...', '...', '..'],
         'Event': ['Btn click', 'Identification', 'Btn click', 'Btn click', 'Identification'],
         'Data': ['foo', 'bar', 'foo', 'foo', 'bar']}
df = pd.DataFrame(data1)
Answered By - Panda Kim