Saturday, January 13, 2024

[FIXED] Groupby column of sequence of numbers


Issue

This is my dataframe:

import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)

And this is how I want to group the rows, based on column b:

     a  b
# -------------------------
0  101  1
1    9  0
2    1  1
3   10  0
4  100  0
5  103  1
6  102  1
7  105  1
8   90  0
9  110  1
# -------------------------
2    1  1
3   10  0
4  100  0
5  103  1
6  102  1
7  105  1
8   90  0
9  110  1
# -------------------------
5  103  1
6  102  1
7  105  1
8   90  0
9  110  1
# -------------------------
9  110  1

I need to find the rows where b is 1 and either the previous value of b is 0 or the row is the first one. Then, for each such row, I want to create a group that runs from that row to the end of the dataframe.

This image clarifies the point:

I can identify most of these rows, but I don't know how to continue. Note that I only need the groups themselves, so that I can apply some functions to them later. My attempt also fails to identify the first row:

df.loc[(df.b == 1) & (df.b.shift(1) == 0), 'c'] = 'x'

Output:

     a  b    c
0  101  1  NaN
1    9  0  NaN
2    1  1    x
3   10  0  NaN
4  100  0  NaN
5  103  1    x
6  102  1  NaN
7  105  1  NaN
8   90  0  NaN
9  110  1    x

Solution

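I think you are looking for a mask that is True where b is 1 and the previous value of b is not 1. Because shift(1) leaves NaN in the first row and NaN is not equal to 1, a leading 1 is flagged as well: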

m = df['b'].eq(1) & df['b'].shift(1).ne(1)
df['c'] = m.astype(int)

Output:

>>> df
     a  b  c
0  101  1  1
1    9  0  0
2    1  1  1
3   10  0  0
4  100  0  0
5  103  1  1
6  102  1  0
7  105  1  0
8   90  0  0
9  110  1  1

>>> df[df['c'].eq(1)].index
Index([0, 2, 5, 9], dtype='int64')
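
The difference from the attempt in the question comes down to the first row. Here is a minimal sketch comparing the two masks, assuming the same df as above. The names prev and strict are only illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)

# shift(1) moves every value down one row, so the first row becomes NaN.
prev = df['b'].shift(1)

# Mask from the question: the previous value must be exactly 0.
# NaN == 0 is False, so a 1 in the first row is never flagged.
strict = df['b'].eq(1) & prev.eq(0)

# Mask from the answer: the previous value only has to differ from 1.
# NaN != 1 is True, so a 1 in the first row is flagged too.
m = df['b'].eq(1) & prev.ne(1)

print(strict[strict].index.tolist())
print(m[m].index.tolist())
```

The first list should lack index 0, while the second should include it.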

Usage:

# It also works with m[m].index
for g, i in enumerate(df[df['c'].eq(1)].index, 1):
    print(f'[Group {g}]')
    print(df.loc[i:], end='\n\n')

# Output
[Group 1]
     a  b  c
0  101  1  1
1    9  0  0
2    1  1  1
3   10  0  0
4  100  0  0
5  103  1  1
6  102  1  0
7  105  1  0
8   90  0  0
9  110  1  1

[Group 2]
     a  b  c
2    1  1  1
3   10  0  0
4  100  0  0
5  103  1  1
6  102  1  0
7  105  1  0
8   90  0  0
9  110  1  1

[Group 3]
     a  b  c
5  103  1  1
6  102  1  0
7  105  1  0
8   90  0  0
9  110  1  1

[Group 4]
     a  b  c
9  110  1  1
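
Since the goal is to apply functions to the groups later, one option is to collect the slices in a dict rather than print them. The following is only a sketch under that assumption: the helper name suffix_groups and the choice of summing column a are hypothetical, picked for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        'a': [101, 9, 1, 10, 100, 103, 102, 105, 90, 110],
        'b': [1, 0, 1, 0, 0, 1, 1, 1, 0, 1],
    }
)


def suffix_groups(frame, col='b'):
    """Map each start label to the slice running from that row to the end."""
    s = frame[col]
    # A start row has a 1 whose previous value is not 1 (NaN counts as "not 1").
    m = s.eq(1) & s.shift(1).ne(1)
    return {int(i): frame.loc[i:] for i in m[m].index}


groups = suffix_groups(df)

# Example of applying a function to every group: sum column 'a'.
sums = {start: int(g['a'].sum()) for start, g in groups.items()}
print(sums)
```

Each value in groups is a label-based slice of the original frame, so call .copy() on it before modifying it if the original must stay untouched.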

Answered By - Corralien

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.