Issue
I am trying to add a new column in pandas with some text if the value in the 'Level' column is found in a list. It works; however, the second np.where overwrites the changes made by the first, and I cannot figure out how to keep both sets of changes. I tried a for loop, an if statement, and making a copy (newdf = ppl.copy()), but then was not sure how to move the results into the copy. From digging around, I think I might need ppl.loc? I'm sure I am missing something simple.
import pandas as pd
import numpy as np
ppl = pd.read_excel('MyXLS.xlsx', sheet_name='People')
list8 = ['8-01','8-02','8-03','8-59']
list9 = ['9-01','9-04','9-36']
LVL8 = "These change Tuesday"
LVL9 = "These change Saturday"
ppl['Weekday'] = np.where(ppl['Level'].isin(list8), LVL8, 'NA')
print(ppl.head(10))
ppl['Weekday'] = np.where(ppl['Level'].isin(list9), LVL9, 'NA')
print(ppl.head(10))
The output:
Name Level Weekday
0 Mark 8-01 These change Tuesday
1 Gary 8-02 These change Tuesday
2 Lisa 8-03 These change Tuesday
3 John 8-59 These change Tuesday
4 Jessie 9-01 NA
5 Chris 9-04 NA
6 Sam 9-36 NA
Name Level Weekday
0 Mark 8-01 NA
1 Gary 8-02 NA
2 Lisa 8-03 NA
3 John 8-59 NA
4 Jessie 9-01 These change Saturday
5 Chris 9-04 These change Saturday
6 Sam 9-36 These change Saturday
Solution
You could nest your np.where calls so that the second doesn't overwrite the changes made by the first:
ppl['Weekday'] = np.where(ppl['Level'].isin(list9), LVL9, np.where(ppl['Level'].isin(list8), LVL8, 'NA'))
But it's simpler to just use .loc to address the specific rows:
ppl['Weekday'] = 'NA'
ppl.loc[ppl['Level'].isin(list8), 'Weekday'] = LVL8
ppl.loc[ppl['Level'].isin(list9), 'Weekday'] = LVL9
Or perhaps use np.select:
ppl['Weekday'] = np.select([ ppl['Level'].isin(list8), ppl['Level'].isin(list9) ], [ LVL8, LVL9 ], 'NA')
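To check that the three approaches above give the same result, here is a minimal self-contained sketch. It assumes an inline DataFrame in place of MyXLS.xlsx, built from the rows printed in the question, plus one hypothetical row ("Pat", level "7-01") that matches neither list so the 'NA' default is exercised. The names in8, in9, nested and selected are introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Inline stand-in for the 'People' sheet of MyXLS.xlsx (assumption).
# The last row ("Pat", "7-01") is hypothetical and matches neither list.
ppl = pd.DataFrame({
    "Name": ["Mark", "Gary", "Lisa", "John", "Jessie", "Chris", "Sam", "Pat"],
    "Level": ["8-01", "8-02", "8-03", "8-59", "9-01", "9-04", "9-36", "7-01"],
})

list8 = ['8-01', '8-02', '8-03', '8-59']
list9 = ['9-01', '9-04', '9-36']
LVL8 = "These change Tuesday"
LVL9 = "These change Saturday"

# Boolean masks, computed once and reused by every approach
in8 = ppl['Level'].isin(list8)
in9 = ppl['Level'].isin(list9)

# 1. Nested np.where: the inner call supplies the fallback for the outer one
nested = np.where(in9, LVL9, np.where(in8, LVL8, 'NA'))

# 2. np.select: the first matching condition wins, otherwise the default
selected = np.select([in8, in9], [LVL8, LVL9], default='NA')

# 3. .loc: start from the default, then overwrite only the matching rows
ppl['Weekday'] = 'NA'
ppl.loc[in8, 'Weekday'] = LVL8
ppl.loc[in9, 'Weekday'] = LVL9

print(ppl)
print(nested.tolist() == selected.tolist() == ppl['Weekday'].tolist())
```

The results agree here because list8 and list9 do not overlap. If a level appeared in both lists, the priority would depend on the approach: np.select takes the first condition that matches, the nested np.where as written prefers list9, and with .loc the last assignment wins.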
Answered By - Nick