Issue
I want to replace the values in a column of this data frame with numbers.
For example, Bernard should become 1, Drake should become 2, and so on. How can I iterate through the column, or write a function, to do this?
Solution
The function already exists: pd.factorize.
It returns a tuple. The first element is an array holding the integer code each item has been mapped to; the second is an Index of the unique values.
import pandas as pd
df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})
pd.factorize(df.name)
(array([0, 0, 1, 1, 2]), Index(['Bernard', 'Drake', 'Lance'], dtype='object'))
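To make the two parts of that tuple concrete, here is a minimal sketch that unpacks them and uses the codes to look the original values back up. The variable names `codes`, `uniques` and `restored` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})

# Unpack the tuple: integer codes, and the unique values they point to
codes, uniques = pd.factorize(df['name'])

# Each code is a position in `uniques`, so taking those positions
# rebuilds the original column values
restored = uniques.take(codes)
```

Codes are handed out in order of first appearance, which is why Bernard gets 0 here.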
Since the codes start at 0, we add 1 and assign the result as a new column:
df = df.assign(codes=pd.factorize(df.name)[0] + 1)
df
      name  codes
0  Bernard      1
1  Bernard      1
2    Drake      2
3    Drake      2
4    Lance      3
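Since the question asked for a function, the same idea can be wrapped in a small helper. This is only a sketch: `add_codes`, its `start` parameter and the `<column>_code` naming are hypothetical choices, not part of pandas.

```python
import pandas as pd

def add_codes(frame, column, start=1):
    """Return a copy of `frame` with an extra integer code column.

    Hypothetical helper: the name and the `<column>_code` naming
    are illustrative, not a pandas API.
    """
    # factorize returns (codes, uniques); only the codes are needed here
    codes, _ = pd.factorize(frame[column])
    # codes start at 0, so shift them to begin at `start`
    return frame.assign(**{f"{column}_code": codes + start})

df = pd.DataFrame({'name': ['Bernard', 'Bernard', 'Drake', 'Drake', 'Lance']})
result = add_codes(df, 'name')
```

One thing to watch: by default pd.factorize marks missing values with -1, so with start=1 any NaN in the column would end up with code 0.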
Answered By - creanion