Issue
I have a pd.DataFrame in which one column contains lists as its values. I want to create another column that contains only the most common value(s) from each list in that column.
Example dataframe:
col_1
0 [1, 2, 3, 3]
1 [2, 2, 8, 8, 7]
2 [3, 4]
The expected dataframe is:
col_1 col_2
0 [1, 2, 3, 3] [3]
1 [2, 2, 8, 8, 7] [2, 8]
2 [3, 4] [3, 4]
I tried:
from statistics import mode
df['col_1'].apply(lambda x: mode(x))
But it shows the most common list in that column.
I also tried calling the pandas mode function directly on that column, but that did not help either. Is there any way to find the most common value(s) in each list?
Solution
Explode the lists into one row per element, then use mode per group:
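df['col_2'] = (df['col_1']
               .explode()           # one row per list element; the original index is repeated
               .groupby(level=0)    # regroup the elements by their original row
               .apply(lambda x: x.mode().tolist())  # keep every value tied for most frequent
              )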
output:
col_1 col_2
0 [1, 2, 3, 3] [3]
1 [2, 2, 8, 8, 7] [2, 8]
2 [3, 4] [3, 4]
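As a possible alternative (not part of the original answer), the standard library's statistics.multimode (Python 3.8+) returns every value tied for the highest count, so it can be applied to each list directly without exploding. The sketch below rebuilds the example dataframe from the question. One difference to be aware of: multimode lists ties in the order they first appear in the list, whereas Series.mode returns them sorted.

```python
from statistics import multimode

import pandas as pd

# Example data copied from the question
df = pd.DataFrame({"col_1": [[1, 2, 3, 3], [2, 2, 8, 8, 7], [3, 4]]})

# multimode returns a list of all values sharing the highest count,
# in the order each value first appears in the input list
df["col_2"] = df["col_1"].apply(multimode)

print(df)
```

For these three rows the result matches the explode/groupby approach above. On large frames, it may be worth timing both versions, since each one makes a Python-level call per row.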
Answered By - mozway