Issue
I want to apply the .nunique() function to a full dataFrame.
On the following screenshot, we can see that it contains 130 features. Screenshot of shape and columns of the dataframe. The goal is to get the number of different values per feature. I use the following code (that worked on another dataFrame).
def nbDifferentValues(data):
total = data.nunique()
total = total.sort_values(ascending=False)
percent = (total/data.shape[0]*100)
return pd.concat([total, percent], axis=1, keys=['Total','Pourcentage'])
diffValues = nbDifferentValues(dataFrame)
And the code fails at the first line and I get the following error which I don't know how to solve ("unhashable type : 'list'", 'occured at index columns'): Trace of the error
Solution
You probably have a column whose content are lists.
Since lists in Python are mutable they are unhashable.
import pandas as pd
df = pd.DataFrame([
(0, [1,2]),
(1, [2,3])
])
# raises "unhashable type : 'list'" error
df.nunique()
SOLUTION: Don't use mutable structures (like lists) in your dataframe:
df = pd.DataFrame([
(0, (1,2)),
(1, (2,3))
])
df.nunique()
# 0 2
# 1 2
# dtype: int64
Answered By - raul ferreira
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.