Issue
I am able to get the rolling minima of a column using:
df['MemberLows'] = df['Members'].rolling(5, closed="left").min()
Is there a way of returning values that are within a tolerance of the minima without resorting to a non-vectorized approach like a for loop?
For example, given the table:
| Month | Members |
|---|---|
| Jan | 10 |
| Feb | 10 |
| Mar | 10 |
| Apr | 6 |
| May | 5 |
| Jun | 10 |
| Jul | 10 |
| Aug | 10 |
| Sep | 10 |
| Oct | 7 |
| Nov | 8 |
| Dec | 10 |
The rolling minima would be 5 and 7. Say I would like to also return values within 20% of the minima; this would also return 6 and 8.
Ultimately I want to return two arrays: one containing the indices of the minima and of the values within 20% of them, and one containing the corresponding values:
indices_arr = [3 4 9 10]
values_arr = [6 5 7 8]
Solution
You are almost there. Just calculate the relevant threshold for each row and compare the row's value against it, like so:
>>> import pandas as pd
>>> df = pd.read_csv("table.tsv", delimiter="\t")
>>> result = df[df["Members"] <= (df['Members'].rolling(5, center=True, closed="left").min() * 1.2)]
>>> print(result)
   Month  Members
3    Apr        6
4    May        5
9    Oct        7
10   Nov        8
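The question also asked for the indices and the values as two separate arrays. Here is a minimal sketch of how they could be pulled out of the filtered frame. The inline DataFrame is my stand-in for table.tsv, and the names indices_arr and values_arr are taken from the question:

```python
import pandas as pd

# Inline copy of the example table (a stand-in for table.tsv).
df = pd.DataFrame({
    "Month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
    "Members": [10, 10, 10, 6, 5, 10, 10, 10, 10, 7, 8, 10],
})

# Same threshold as in the one-liner: 20% above the rolling minimum.
threshold = df["Members"].rolling(5, center=True, closed="left").min() * 1.2

# Rows with an incomplete window get a NaN threshold; a comparison
# against NaN is False, so those rows drop out of the result.
result = df[df["Members"] <= threshold]

# Row labels and values as plain NumPy arrays.
indices_arr = result.index.to_numpy()
values_arr = result["Members"].to_numpy()

print(indices_arr)  # [ 3  4  9 10]
print(values_arr)   # [6 5 7 8]
```

Splitting the threshold into its own variable is only for readability; the filtering is the same as in the one-liner.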
Two things worth noting:
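- I'm assuming that the values are positive. For a negative minimum, multiplying by 1.2 would lower the threshold instead of raising it, so the comparison would no longer mean "within 20%".
- I added center=True so that each window includes the row being tested; with closed="left" alone, the window would cover only the preceding rows.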
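If the column might contain negative values, one possible variation is to add a margin proportional to the absolute value of the minimum instead of multiplying. This is only a sketch: the helper name within_tolerance and its parameters are my own, not part of pandas, and it reads "within 20%" as "no more than 20% of |min| above the minimum":

```python
import pandas as pd

def within_tolerance(values: pd.Series, window: int = 5, tol: float = 0.2) -> pd.Series:
    """Boolean mask: True where a value lies within `tol` of its rolling minimum."""
    roll_min = values.rolling(window, center=True, closed="left").min()
    # Margin proportional to |min|, so the threshold sits at or above
    # the minimum whatever its sign.
    threshold = roll_min + tol * roll_min.abs()
    return values <= threshold

members = pd.Series([10, 10, 10, 6, 5, 10, 10, 10, 10, 7, 8, 10])
mask = within_tolerance(members)
print(members[mask].index.tolist())  # [3, 4, 9, 10]
```

For positive data this selects the same rows as multiplying by 1.2.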
Answered By - VRehnberg