Issue
Given a Pandas DataFrame that has multiple columns with categorical values (0 or 1), is it possible to conveniently get the value_counts for every column at the same time?
For example, suppose I generate a DataFrame as follows:
import numpy as np
import pandas as pd
np.random.seed(0)
df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
I can get a DataFrame like this:
a b c d
0 0 1 1 0
1 1 1 1 1
2 1 1 1 0
3 0 1 0 0
4 0 0 0 1
5 0 1 1 0
6 0 1 1 1
7 1 0 1 0
8 1 0 1 1
9 0 1 1 0
How do I conveniently get the value counts for every column and obtain the following conveniently?
a b c d
0 6 3 2 6
1 4 7 8 4
My current solution is:
pieces = []
for col in df.columns:
tmp_series = df[col].value_counts()
tmp_series.name = col
pieces.append(tmp_series)
df_value_counts = pd.concat(pieces, axis=1)
But there must be a simpler way, like stacking, pivoting, or groupby?
Solution
Just call apply
and pass pd.Series.value_counts
:
In [212]:
df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd'))
df.apply(pd.Series.value_counts)
Out[212]:
a b c d
0 4 6 4 3
1 6 4 6 7
Answered By - EdChum
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.