Issue
I have a dataframe:
import pandas as pd

mydict = [
{'HH': True, 'LL': False, 'High': 10, 'Low': 1},
{'HH': False, 'LL': True, 'High': 100, 'Low': 20},
{'HH': True, 'LL': False, 'High': 32, 'Low': 1},
{'HH': True, 'LL': False, 'High': 30, 'Low': 1},
{'HH': True, 'LL': False, 'High': 31, 'Low': 1},
{'HH': False, 'LL': True, 'High': 100, 'Low': 40},
{'HH': False, 'LL': True, 'High': 100, 'Low': 45},
{'HH': False, 'LL': True, 'High': 100, 'Low': 42},
{'HH': False, 'LL': True, 'High': 100, 'Low': 44},
{'HH': True, 'LL': False, 'High': 50, 'Low': 1},
]
df = pd.DataFrame(mydict)
print(df)
      HH     LL  High  Low
0   True  False    10    1
1  False   True   100   20
2   True  False    32    1
3   True  False    30    1
4   True  False    31    1
5  False   True   100   40
6  False   True   100   45
7  False   True   100   42
8  False   True   100   44
9   True  False    50    1
I am trying to find peak values on a chart. So if there are several consecutive True values in either HH or LL,
I want to keep only the one with the highest High or the lowest Low, respectively. I tried doing it like this:
check = True
while check:
    df2 = df[df.HH | df.LL]
    h1 = df2.HH & df2.HH.shift()
    h2 = df2.High < df2.High.shift()
    h3 = df2.HH & df2.HH.shift(-1)
    h4 = df2.High < df2.High.shift(-1)
    l1 = df2.LL & df2.LL.shift()
    l2 = df2.Low > df2.Low.shift()
    l3 = df2.LL & df2.LL.shift(-1)
    l4 = df2.Low > df2.Low.shift(-1)
    df3 = df2[(h1 & h2 | h3 & h4) | (l1 & l2 | l3 & l4)]
    df.loc[df.index.isin(df3.index), ["HH", "LL"]] = False
    check = not df3.empty
And it seems to produce the desired result on a small example dataframe:
      HH     LL  High  Low
0   True  False    10    1
1  False   True   100   20
2   True  False    32    1
3  False  False    30    1
4  False  False    31    1
5  False   True   100   40
6  False  False   100   45
7  False  False   100   42
8  False  False   100   44
9   True  False    50    1
But for a reason I have yet to figure out, it still occasionally leaves repeated peaks in a bigger dataframe.
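One possible cause, which the question does not confirm, is ties. The loop only clears a row when its High is strictly lower (or its Low strictly higher) than a flagged neighbour's, so two consecutive flagged rows with equal values never clear each other. A minimal sketch on made-up data, reproducing just the HH half of the tests (h1 & h2 and h3 & h4):

```python
import pandas as pd

# Made-up data: two consecutive HH rows with the same High.
tie = pd.DataFrame({
    'HH':   [True, True],
    'LL':   [False, False],
    'High': [50, 50],
    'Low':  [1, 1],
})

# Same strict comparisons as h1 & h2 and h3 & h4 in the loop, HH side only.
# fill_value=False keeps the shifted flags boolean at the edges.
worse_than_prev = tie.HH & tie.HH.shift(fill_value=False) & (tie.High < tie.High.shift())
worse_than_next = tie.HH & tie.HH.shift(-1, fill_value=False) & (tie.High < tie.High.shift(-1))
to_clear = worse_than_prev | worse_than_next

print(to_clear.tolist())
```

Neither row is selected for clearing, so the loop would stop with both rows still flagged. Whether the bigger dataframe actually contains such ties is an assumption that would need checking.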
Solution
Assuming that, within each run of consecutive rows with the same HH/LL flags, the max High and min Low are in the same row, you can use boolean indexing with groupby.transform:
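# Label each run of consecutive rows that share the same HH/LL flags
group = (df[['HH', 'LL']].ne(df[['HH', 'LL']].shift())
         .any(axis=1).cumsum()
        )
g = df.groupby(group)
# Rows holding the highest High / lowest Low of their run
m1 = g['High'].transform('max').eq(df['High'])
m2 = g['Low'].transform('min').eq(df['Low'])
# Clear the flags on every row that is not the peak of its run
df.loc[~(m1 & m2), ['HH', 'LL']] = False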
Output:
      HH     LL  High  Low
0   True  False    10    1
1  False   True   100   20
2   True  False    32    1
3  False  False    30    1
4  False  False    31    1
5  False   True   100   40
6  False  False   100   45
7  False  False   100   42
8  False  False   100   44
9   True  False    50    1
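If a run can contain tied extremes, or if the max High and min Low may fall on different rows, a variant based on idxmax/idxmin keeps exactly one row per run. This is only a sketch, not part of the answer: the helper name keep_single_peak is hypothetical, and it assumes a unique index and that no row has both HH and LL set.

```python
import pandas as pd

def keep_single_peak(df):
    """Keep one flagged row per run (hypothetical helper, not from the answer)."""
    out = df.copy()
    flags = out[['HH', 'LL']]
    # Same run labelling as above: a new run starts when either flag changes.
    run = flags.ne(flags.shift()).any(axis=1).cumsum()
    # idxmax/idxmin return the index label of the first extreme row in each run.
    hh_keep = out.loc[out['HH'], 'High'].groupby(run[out['HH']]).idxmax()
    ll_keep = out.loc[out['LL'], 'Low'].groupby(run[out['LL']]).idxmin()
    keep = out.index.isin(hh_keep.tolist() + ll_keep.tolist())
    out.loc[~keep, ['HH', 'LL']] = False
    return out

# Made-up data: two tied highs followed by two tied lows.
tied = pd.DataFrame({
    'HH':   [True, True, False, False],
    'LL':   [False, False, True, True],
    'High': [50, 50, 100, 100],
    'Low':  [1, 1, 20, 20],
})
result = keep_single_peak(tied)
print(result)
```

On ties this keeps the first row of the run, because idxmax and idxmin return the first occurrence. HH runs are judged only on High and LL runs only on Low, so the same-row assumption is no longer needed.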
Intermediates:
      HH     LL  High  Low  group     m1     m2  m1&m2
0   True  False    10    1      1   True   True   True
1  False   True   100   20      2   True   True   True
2   True  False    32    1      3   True   True   True
3   True  False    30    1      3  False   True  False
4   True  False    31    1      3  False   True  False
5  False   True   100   40      4   True   True   True
6  False   True   100   45      4   True  False  False
7  False   True   100   42      4   True  False  False
8  False   True   100   44      4   True  False  False
9   True  False    50    1      5   True   True   True
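To inspect these intermediates yourself, the masks can be attached to a copy of the frame before the flags are overwritten. A minimal sketch using the question's data; the column name keep is chosen here for illustration and stands for m1&m2:

```python
import pandas as pd

df = pd.DataFrame({
    'HH':   [True, False, True, True, True, False, False, False, False, True],
    'LL':   [False, True, False, False, False, True, True, True, True, False],
    'High': [10, 100, 32, 30, 31, 100, 100, 100, 100, 50],
    'Low':  [1, 20, 1, 1, 1, 40, 45, 42, 44, 1],
})

flags = df[['HH', 'LL']]
# A row starts a new run when either flag differs from the previous row;
# the cumulative sum of those change points gives each run an id.
group = flags.ne(flags.shift()).any(axis=1).cumsum()
g = df.groupby(group)
m1 = g['High'].transform('max').eq(df['High'])  # row has its run's max High
m2 = g['Low'].transform('min').eq(df['Low'])    # row has its run's min Low

# assign returns a new frame, so df itself is left untouched.
debug = df.assign(group=group, m1=m1, m2=m2, keep=m1 & m2)
print(debug)
```

Rows where keep is False are the ones whose HH and LL flags the answer resets.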
Answered By - mozway