Tuesday, October 25, 2022

[FIXED] Pandas : Assigning values (1,0) to grouped data

October 25, 2022 dataframe, pandas, python-3.x No comments

Issue

I have a dataframe df like below :


import pandas as pd

data = {'A': ['ABCD_1', 'ABCD_1', 'ABCD_1', 'ABCD_1', 'PQRS_2', 'PQRS_2', 'PQRS_2', 'PQRS_2', 'PQRS_2', 'PQRS_3', 'PQRS_4'], 'B':[2, 3, 5, 6, 7, 8, 9, 11, 13, 15, 17]}
df = pd.DataFrame(data)
df


         |     A      |    B     |
         +------------+----------+
         |    ABCD_1  |    2     |
         |    ABCD_1  |    3     |
         |    ABCD_1  |    5     |
         |    ABCD_1  |    6     |
         |    PQRS_2  |    7     |
         |    PQRS_2  |    8     |
         |    PQRS_2  |    9     |
         |    PQRS_2  |    11    |
         |    PQRS_2  |    13    |
         |    PQRS_3  |    15    |
         |    PQRS_4  |    17    |
         +------------+----------+

What I want to achieve is that I need to assign 1 to first value of every group of column A and 0 to the rest of other values. Incase, there is a single group present in Column A I just assign 1 to that group. So, my expected result should look like below.

Expected Output :

         |     A      |    B     |    P     |
         +------------+----------+----------+
         |    ABCD_1  |    2     |     1    |
         |    ABCD_1  |    3     |     0    |
         |    ABCD_1  |    5     |     0    | 
         |    ABCD_1  |    6     |     0    |
         |    PQRS_2  |    7     |     1    |
         |    PQRS_2  |    8     |     0    |
         |    PQRS_2  |    9     |     0    |
         |    PQRS_2  |    11    |     0    |
         |    PQRS_2  |    13    |     0    |
         |    PQRS_3  |    15    |     1    |
         |    PQRS_4  |    17    |     1    |
         +------------+----------+----------+

I tried to group them using the below code, however, I wasn't able to get expected results.

Actual Output

df['P'] = pd.factorize(df['A']) + 1

         |     A      |    B     |    P     |
         +------------+----------+----------+
         |    ABCD_1  |    2     |     1    |
         |    ABCD_1  |    3     |     1    |
         |    ABCD_1  |    5     |     1    | 
         |    ABCD_1  |    6     |     1    |
         |    PQRS_2  |    7     |     1    |
         |    PQRS_2  |    8     |     2    |
         |    PQRS_2  |    9     |     2    |
         |    PQRS_2  |    11    |     2    |
         |    PQRS_2  |    13    |     2    |
         |    PQRS_3  |    15    |     3    |
         |    PQRS_4  |    17    |     4    |
         +------------+----------+----------+

How can I assign 1 and 0 to each of the groups using Pandas ?

Solution

You could achieve this as follows:

Use df.duplicated to get a boolean series with False for each first group entry and True for the others.
Next, use the unary operator (~) to switch True and False values, and chain Series.astype to turn the booleans into ones (True) and zeros (False).

df['P'] = (~df.A.duplicated()).astype(int)

print(df)

         A   B  P
0   ABCD_1   2  1
1   ABCD_1   3  0
2   ABCD_1   5  0
3   ABCD_1   6  0
4   PQRS_2   7  1
5   PQRS_2   8  0
6   PQRS_2   9  0
7   PQRS_2  11  0
8   PQRS_2  13  0
9   PQRS_3  15  1
10  PQRS_4  17  1

Alternatively, but provided that the groups in column A are sorted, you can also try as follows:

Compare each row with the next one (see: Series.shift) with Series.ne. Each row that is unequal to its shift will give us True, all the others will lead to False. Afterwards, again chain astype.

df['P'] = (df.A.ne(df.A.shift())).astype(int)

print(df)

         A   B  P
0   ABCD_1   2  1
1   ABCD_1   3  0
2   ABCD_1   5  0
3   ABCD_1   6  0
4   PQRS_2   7  1
5   PQRS_2   8  0
6   PQRS_2   9  0
7   PQRS_2  11  0
8   PQRS_2  13  0
9   PQRS_3  15  1
10  PQRS_4  17  1

Answered By - ouroboros1

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0

Tuesday, October 25, 2022

[FIXED] Pandas : Assigning values (1,0) to grouped data

Issue

Solution

0 comments:

Post a Comment

Popular Posts

Labels