Issue
I have a dataframe as follows:
Index | Value | Condition1 | Condition2 |
---|---|---|---|
1 | 1 | True | False |
2 | 5 | False | False |
3 | 3 | False | True |
4 | 3 | False | False |
5 | 3 | True | False |
6 | 6 | False | True |
For each row where Condition2 is True, I want to populate a new column with the Value from the most recent previous row where Condition1 is True. Example of the desired output:
Index | Value | Condition1 | Condition2 | New Column |
---|---|---|---|---|
1 | 1 | True | False | None |
2 | 5 | False | False | None |
3 | 3 | False | True | 1 |
4 | 3 | False | False | None |
5 | 3 | True | False | None |
6 | 6 | False | True | 3 |
I tried using tail() with np.where():
df["New Column"] = np.where(df["Condition 2"]==True,df[df["Condition 1"]==True].["Value"].tail(1),None)
However, this gave a syntax error. Since the dataframe may grow, I would also like to achieve this without using a loop.
Appreciate any help and advice!
Solution
You can use masks and where combined with ffill:
df['New Column'] = (df['Value'].where(df['Condition1']).ffill()
.where(df['Condition2'])
)
Variant with boolean indexing:
df.loc[df['Condition2'], 'New Column'] = df['Value'].where(df['Condition1']).ffill()
Output:
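 | Index | Value | Condition1 | Condition2 | New Column |
---|---|---|---|---|---|
0 | 1 | 1 | True | False | NaN |
1 | 2 | 5 | False | False | NaN |
2 | 3 | 3 | False | True | 1.0 |
3 | 4 | 3 | False | False | NaN |
4 | 5 | 3 | True | False | NaN |
5 | 6 | 6 | False | True | 3.0 |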
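For anyone who wants to reproduce the output, here is a minimal self-contained sketch. It assumes the sample data from the question; the extra column names "New Column (loc)" and "New Column (Int64)" are made up for illustration so that both variants can be compared side by side. The last step is an optional addition, not part of the answer: it converts the float result to pandas' nullable integer dtype, which may be closer to the desired output (1 and 3 instead of 1.0 and 3.0).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6],
    "Value": [1, 5, 3, 3, 3, 6],
    "Condition1": [True, False, False, False, True, False],
    "Condition2": [False, False, True, False, False, True],
})

# Keep Value only where Condition1 is True, then carry it forward
# so every row knows the last Value seen under Condition1.
last_cond1_value = df["Value"].where(df["Condition1"]).ffill()

# Main form: keep the carried value only where Condition2 is True.
df["New Column"] = last_cond1_value.where(df["Condition2"])

# Variant with boolean indexing, written to a separate
# (hypothetical) column so the two results can be compared.
df.loc[df["Condition2"], "New Column (loc)"] = last_cond1_value

# Optional, not in the original answer: nullable integers
# show missing rows as <NA> and keep 1 and 3 as integers.
df["New Column (Int64)"] = df["New Column"].astype("Int64")

print(df)
```

Both forms are vectorized, so no explicit Python loop is involved.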
Intermediates:
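 | Index | Value | Condition1 | Condition2 | where (condition1) | ffill | where (condition2) |
---|---|---|---|---|---|---|---|
0 | 1 | 1 | True | False | 1.0 | 1.0 | NaN |
1 | 2 | 5 | False | False | NaN | 1.0 | NaN |
2 | 3 | 3 | False | True | NaN | 1.0 | 1.0 |
3 | 4 | 3 | False | False | NaN | 1.0 | NaN |
4 | 5 | 3 | True | False | 3.0 | 3.0 | NaN |
5 | 6 | 6 | False | True | NaN | 3.0 | 3.0 |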
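The intermediate columns can be rebuilt one step at a time, which may help when adapting the approach to other data. This is only a sketch on the question's sample data; the names step1, step2, step3 and intermediates are illustrative and do not appear in the answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Index": [1, 2, 3, 4, 5, 6],
    "Value": [1, 5, 3, 3, 3, 6],
    "Condition1": [True, False, False, False, True, False],
    "Condition2": [False, False, True, False, False, True],
})

# Step 1: Value where Condition1 is True, NaN everywhere else.
step1 = df["Value"].where(df["Condition1"])

# Step 2: forward-fill, so each row carries the last Condition1 value.
step2 = step1.ffill()

# Step 3: keep the carried value only on rows where Condition2 is True.
step3 = step2.where(df["Condition2"])

# Put the three steps next to the original columns for inspection.
intermediates = df.assign(**{
    "where (condition1)": step1,
    "ffill": step2,
    "where (condition2)": step3,
})
print(intermediates)
```

Note that a Condition2 row occurring before the first Condition1 row would stay NaN, because ffill has nothing to carry forward at that point.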
Answered By - mozway