Issue
Suppose I have the dataframe (df1),
ID | customer_only | city | state |
---|---|---|---|
1 | N | 0 | 0 |
1 | Y | 0 | 0 |
1 | 0 | A | B |
2 | N | 0 | 0 |
2 | Y | 0 | 0 |
2 | 0 | C | D |
2 | 0 | E | F |
3 | N | 0 | 0 |
3 | Y | 0 | 0 |
3 | 0 | G | H |
The expected output is as follows.
ID | customer_only | city | state |
---|---|---|---|
1 | N | A | B |
1 | Y | A | B |
2 | N | C | D |
2 | N | E | F |
2 | Y | C | D |
2 | Y | E | F |
3 | N | G | H |
3 | Y | G | H |
Here is my code.
Fill state and city columns with
np.NaN
data = { "ID": [1,1,1,2,2,2,2,3,3,3], "customer_only": ['N','Y',0,'N','Y',0,0,'N','Y',0], "city": [0,0,'A',0,0,'C','E',0,0,'G'], "state": [0,0,'B',0,0,'D','F',0,0,'H']} df1 = pd.DataFrame(data) df1[['city', 'state']] = df1[['city', 'state']].replace(0, np.NaN)
I used
bfill()
to fill the previous NaN(s) with the values of city and state in the row with customer_only=0.df1[['city', 'state']] = df1[['city', 'state']].bfill()
However, for ID=2,
bfill()
does not give the expected output. I also usedgroupby()
, but it cannot reshape the dimensions of ID=2 rows.
Any suggestions?
Solution
You can slice the 0/non-0 rows separately and merge
:
# identify non-0 rows (use the correct type)
m = df['customer_only'].ne(0)
# pick the non-0 rows, merge with the 0s on ID
out = (df.loc[m, ['ID', 'customer_only']]
.merge(df.loc[~m, ['ID', 'city', 'state']], on='ID')
)
Output:
ID customer_only city state
0 1 N A B
1 1 Y A B
2 2 N C D
3 2 N E F
4 2 Y C D
5 2 Y E F
6 3 N G H
7 3 Y G H
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.