Issue
This is my dataframe:
import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)
And this is how I want to group them by column b:
a b
# -------------------------
0 101 1
1 9 0
2 1 1
3 10 0
4 100 0
5 103 1
6 102 1
7 105 1
8 90 0
9 110 1
# -------------------------
2 1 1
3 10 0
4 100 0
5 103 1
6 102 1
7 105 1
8 90 0
9 110 1
# -------------------------
5 103 1
6 102 1
7 105 1
8 90 0
9 110 1
# -------------------------
9 110 1
I need to find the rows where b is 1 and either the previous value of b is 0 or the row is the first one. Each group then runs from that row to the end of the dataframe.
I could identify these rows, but I don't know how to continue. Note that I just need the groups, so that I can apply some functions to them later. My attempt also fails to identify the first row:
df.loc[(df.b == 1) & (df.b.shift(1) == 0), 'c'] = 'x'
a b c
0 101 1 NaN
1 9 0 NaN
2 1 1 x
3 10 0 NaN
4 100 0 NaN
5 103 1 x
6 102 1 NaN
7 105 1 NaN
8 90 0 NaN
9 110 1 x
Solution
I think you are looking for:
m = df['b'].eq(1) & df['b'].shift(1).ne(1)
df['c'] = m.astype(int)
Output:
>>> df
a b c
0 101 1 1
1 9 0 0
2 1 1 1
3 10 0 0
4 100 0 0
5 103 1 1
6 102 1 0
7 105 1 0
8 90 0 0
9 110 1 1
>>> df[df['c'].eq(1)].index
Index([0, 2, 5, 9], dtype='int64')
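As a rough illustration of why this mask picks up row 0 while the attempt in the question does not: shift(1) leaves NaN in the first position, and NaN compares as "not equal to 1" but never as "equal to 0". The sketch below rebuilds the dataframe from the question; the names prev, starts_eq and starts_ne are only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)

# Previous value of b; the first entry is NaN because row 0 has no predecessor
prev = df['b'].shift(1)

# Condition from the question: NaN == 0 is False, so row 0 is missed
starts_eq = df['b'].eq(1) & prev.eq(0)

# Condition from the answer: NaN != 1 is True, so row 0 is kept
starts_ne = df['b'].eq(1) & prev.ne(1)

print(starts_eq[starts_eq].index.tolist())  # [2, 5, 9]
print(starts_ne[starts_ne].index.tolist())  # [0, 2, 5, 9]
```

Both conditions agree on every row except the first, which is the only row where the shifted value is missing.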
Usage:
# It also works with m[m].index
for g, i in enumerate(df[df['c'].eq(1)].index, 1):
    print(f'[Group {g}]')
    print(df.loc[i:], end='\n\n')
# Output
[Group 1]
a b c
0 101 1 1
1 9 0 0
2 1 1 1
3 10 0 0
4 100 0 0
5 103 1 1
6 102 1 0
7 105 1 0
8 90 0 0
9 110 1 1
[Group 2]
a b c
2 1 1 1
3 10 0 0
4 100 0 0
5 103 1 1
6 102 1 0
7 105 1 0
8 90 0 0
9 110 1 1
[Group 3]
a b c
5 103 1 1
6 102 1 0
7 105 1 0
8 90 0 0
9 110 1 1
[Group 4]
a b c
9 110 1 1
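Since the question mentions applying functions to the groups later, one possible way to keep them around instead of printing them is a dict of tail slices. This is only a sketch under that assumption; the names groups and sums and the choice of summing column a are hypothetical, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)

# Same mask as in the answer: b is 1 and the previous b is not 1
m = df['b'].eq(1) & df['b'].shift(1).ne(1)

# One tail slice per start row; .loc slicing is label-based and includes the start label
groups = {g: df.loc[i:] for g, i in enumerate(m[m].index, 1)}

# Hypothetical follow-up: apply a function to each group, here the sum of column a
sums = {g: int(sub['a'].sum()) for g, sub in groups.items()}
print(sums)  # {1: 731, 2: 621, 3: 510, 4: 110}
```

Because the groups overlap, a regular groupby on a single key column would not produce them, which is why each one is sliced explicitly here.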
Answered By - Corralien