Issue
I have the following DataFrame:
index | Value |
---|---|
1 | None |
2 | A |
3 | None |
4 | A |
5 | B |
6 | B |
7 | None |
8 | A |
9 | A |
10 | B |
The idea is to replace repeated values with None so that the column never contains two consecutive A or two consecutive B, even when None rows sit between them.
Desired output
index | Value |
---|---|
1 | None |
2 | A |
3 | None |
4 | None |
5 | B |
6 | None |
7 | None |
8 | A |
9 | None |
10 | B |
This can be done easily with a loop but since I am using pandas and numpy, I hope to avoid the loop approach.
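For reference, the loop approach mentioned above might look like the sketch below. The helper name `drop_consecutive_repeats` is hypothetical (it is not from the original post); it is only meant as a baseline to check a vectorized result against.

```python
import pandas as pd


def drop_consecutive_repeats(values):
    """Hypothetical helper: blank out any value equal to the last non-null value."""
    result = []
    last_seen = None  # last non-null value encountered so far
    for v in values:
        if pd.isna(v):
            # Missing values stay missing and do not reset last_seen
            result.append(None)
        elif v == last_seen:
            # Same as the previous non-null value: treat as a repeat
            result.append(None)
        else:
            result.append(v)
        if not pd.isna(v):
            last_seen = v
    return result


values = [None, "A", None, "A", "B", "B", None, "A", "A", "B"]
print(drop_consecutive_repeats(values))
```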
Solution
You can use `ffill` to propagate the last non-None value forward, then `shift` the result down by one row. Wherever the original value equals this shifted series, it repeats the previous non-None value, so set it to None with boolean indexing:
df.loc[df['Value'].eq(df['Value'].ffill().shift()), 'Value'] = None
Or with `mask`:
df['Value'] = df['Value'].mask(df['Value'].eq(df['Value'].ffill().shift()), None)
Output:
index Value
0 1 None
1 2 A
2 3 None
3 4 None
4 5 B
5 6 None
6 7 None
7 8 A
8 9 None
9 10 B
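To reproduce this end to end, here is a minimal, self-contained sketch. It assumes the column holds real `None` objects (not the string `"None"`) and rebuilds the example frame by hand. The names `previous`, `repeated`, `out_loc` and `out_mask` are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the question
df = pd.DataFrame({
    "index": range(1, 11),
    "Value": [None, "A", None, "A", "B", "B", None, "A", "A", "B"],
})

# Last non-null value seen strictly before each row
previous = df["Value"].ffill().shift()

# True where a row repeats the previous non-null value
repeated = df["Value"].eq(previous)

# Variant 1: assign through boolean indexing on a copy
out_loc = df.copy()
out_loc.loc[repeated, "Value"] = None

# Variant 2: build a new column with mask
out_mask = df.assign(Value=df["Value"].mask(repeated, None))

print(out_loc)
```

Depending on the pandas version, the blanked cells may display as `None` or `NaN`; both count as missing for `isna()`.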
Intermediates:
index Value ffill shift eq
0 1 None None None True
1 2 A A None False
2 3 None A A False
3 4 A A A True
4 5 B B A False
5 6 B B B True
6 7 None B B False
7 8 A A B False
8 9 A A A True
9 10 B B A False
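A table like this can be rebuilt with a short sketch that stores each step as its own column. The frame name `steps` is an assumption made for illustration. Rows whose `Value` is already missing are unaffected by the final assignment, whatever their `eq` entry is.

```python
import pandas as pd

values = pd.Series(
    [None, "A", None, "A", "B", "B", None, "A", "A", "B"], name="Value"
)

steps = pd.DataFrame({
    "Value": values,
    "ffill": values.ffill(),          # last non-null value up to and including the row
    "shift": values.ffill().shift(),  # last non-null value strictly before the row
})

# True where the row repeats the previous non-null value
steps["eq"] = steps["Value"].eq(steps["shift"])

print(steps)
```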
Answered By - mozway