Issue
I have a df:
id value
1 10
2 15
1 10
1 10
2 13
3 10
3 20
I am trying to keep only rows that have 1 unique value in column value, so that the result df looks like this:
id value
1 10
1 10
1 10
I dropped id = 2 and id = 3 because each has more than 1 unique value in column value (15, 13 and 10, 20 respectively).
I read this answer.
But that simply removes duplicates, whereas I want to check whether a given column (in this case column value) has more than 1 unique value per group.
I tried:
df['uniques'] = pd.Series(df.groupby('id')['value'].nunique())
But this returns NaN for every row, since I am trying to fit n group results onto n+m rows after grouping. I could write a function and apply it to every row, but I was wondering if there is a smart, quick filter that achieves my goal.
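For illustration, a minimal sketch (reconstructing the sample data above) of why the direct assignment misaligns: groupby(...).nunique() returns one value per id, indexed by id, so it cannot line up with the original row index when assigned back.

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 1, 1, 2, 3, 3],
                   'value': [10, 15, 10, 10, 13, 10, 20]})

# One value per group, indexed by id -- not by the original row labels
per_group = df.groupby('id')['value'].nunique()
print(per_group)
# id
# 1    1
# 2    2
# 3    2
# Name: value, dtype: int64

# Assigning this back aligns on the row index (0..6), not on id,
# so rows whose index has no matching id label end up NaN
df['uniques'] = per_group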
Solution
Use transform with groupby to align the group values to the individual rows:
df['nuniques'] = df.groupby('id')['value'].transform('nunique')
Output:
id value nuniques
0 1 10 1
1 2 15 2
2 1 10 1
3 1 10 1
4 2 13 2
5 3 10 2
6 3 20 2
If you only need to filter your data, you don't need to assign the new column:
df[df.groupby('id')['value'].transform('nunique') == 1]
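For completeness, a minimal self-contained run of the filter above on the sample data from the question (the result variable name is just illustrative):

import pandas as pd

df = pd.DataFrame({'id': [1, 2, 1, 1, 2, 3, 3],
                   'value': [10, 15, 10, 10, 13, 10, 20]})

# transform('nunique') broadcasts the per-id count back onto every row,
# so the boolean mask has the same length as df
result = df[df.groupby('id')['value'].transform('nunique') == 1]
print(result)
#    id  value
# 0   1     10
# 2   1     10
# 3   1     10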
Answered By - Quang Hoang