Issue
I am using a Jupyter notebook and I want to create a filter. I have an array that contains sensor readings. The sensor data are available every second. I want to implement a filter such as:
Store the last two readings of SenorData in an initially empty array (FilterData[]). If the last value (FilterData[-1]) is bigger than a cutoff (500), use the previous value (FilterData[-2]) instead, write this filtered value back to SenorData, and repeat the loop with the next sensor reading.
Here is an example of what my sensor data looks like:
SenorData= [110.3244118,
110.3244118,
110.3244118,
110.3244118,
110.3244118,
110.3244118,
110.3244118,
170.7510605,
170.7510605,
2180.280465,
2417.867061,
2793.702506,
2883.198035,
2822.542497,
2822.542497,
2862.483596,
2862.483596,
2723.776971,
2694.70809,
2812.700278,
2812.700278,
2812.700278,
2596.400342,
2269.66155,
1902.867214,
833.9564196,
457.4343089,
160.3366192,
131.0388501,
131.0388501,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079,
91.09775079]
Solution
IIUC, you are looking for something like this. You use Series.mask on your data with the threshold of 500, which results in a Series where all values greater than 500 are now NaN. I'm not sure if you want ffill() or bfill(): ffill will forward fill the missing values, bfill will backward fill them.
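To make the difference between the two fill directions concrete, here is a minimal sketch on a short, made-up sample (the values in `sample` are hypothetical and are not the full SenorData list from the question). It only uses `Series.mask`, `Series.gt`, `Series.ffill` and `Series.bfill`.

```python
import pandas as pd

# Hypothetical short sample, not the full SenorData list from the question.
sample = pd.Series([110.0, 170.0, 2180.0, 2417.0, 457.0, 160.0])

# Values above the cutoff of 500 become NaN.
masked = sample.mask(sample.gt(500))

# ffill: each NaN takes the last valid value before it.
forward = masked.ffill()

# bfill: each NaN takes the next valid value after it.
backward = masked.bfill()

print(forward.tolist())   # [110.0, 170.0, 170.0, 170.0, 457.0, 160.0]
print(backward.tolist())  # [110.0, 170.0, 457.0, 457.0, 457.0, 160.0]
```

With ffill the spike is replaced by the reading just before it (170.0); with bfill it is replaced by the first reading after it that is below the cutoff (457.0).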
EDIT2 (last one!!! ;) )
I can't get it done with a vectorized pandas method. Since we have to conditionally compare the adjacent value step by step while the adjacent value can change, I think we need a for loop here.
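import matplotlib.pyplot as plt

def filter_data(data):
    # Treat the first reading as the last accepted value.
    prev = data[0]
    masked_list = [prev]
    for val in data[1:]:
        if val >= 3 * prev:
            # Jump of 3x or more: repeat the last accepted value instead.
            masked_list.append(prev)
        else:
            # Reading accepted: it becomes the new reference value.
            prev = val
            masked_list.append(val)
    return masked_list

result = filter_data(SenorData)
#plt.plot(SenorData, marker='o')
plt.plot(result, marker='o')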
The list result contains exactly the same data as the solution of the 1st edit. Please try it and tell me if that is what you are looking for.
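Note that the loop above uses a relative test (val >= 3 * prev), while the question asks for a fixed cutoff of 500. The two happen to agree on the posted data, but they need not agree in general. If the fixed cutoff is what you want, a loop along these lines might do it. This is only a sketch: the function name `filter_with_cutoff`, its `cutoff` parameter and the sample values are made up for illustration.

```python
def filter_with_cutoff(data, cutoff=500.0):
    """Replace every reading above `cutoff` with the last accepted reading."""
    filtered = []
    # Nothing accepted yet. NaN here mirrors what mask(...).ffill() leaves
    # behind when the very first readings are already above the cutoff.
    last_good = float("nan")
    for val in data:
        if val <= cutoff:
            # Reading accepted: remember it as the new fallback value.
            last_good = val
        # Accepted readings pass through, rejected ones repeat last_good.
        filtered.append(last_good)
    return filtered


# Hypothetical short sample, not the full SenorData list from the question.
sample = [110.0, 170.0, 2180.0, 2417.0, 457.0, 160.0]
print(filter_with_cutoff(sample))
# [110.0, 170.0, 170.0, 170.0, 457.0, 160.0]
```

Because it handles one reading at a time and only keeps the last accepted value, the same logic could also be applied to readings as they arrive every second.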
EDIT
Apparently you wanted the solution with ffill(). Here is the changed solution:
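import pandas as pd
import matplotlib.pyplot as plt

data = pd.Series(SenorData)
# Mask values above 500 as NaN, then forward fill with the last valid value.
filtered_data = data.mask(data.gt(500)).ffill()
print(filtered_data)
plt.plot(filtered_data, marker='o')
plt.grid()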
[Figure: plot of filtered_data]
filtered data:
0 110.324412
1 110.324412
2 110.324412
3 110.324412
4 110.324412
5 110.324412
6 110.324412
7 170.751060
8 170.751060
9 170.751060
10 170.751060
11 170.751060
12 170.751060
13 170.751060
14 170.751060
15 170.751060
16 170.751060
17 170.751060
18 170.751060
19 170.751060
20 170.751060
21 170.751060
22 170.751060
23 170.751060
24 170.751060
25 170.751060
26 457.434309
27 160.336619
28 131.038850
29 131.038850
30 91.097751
31 91.097751
32 91.097751
33 91.097751
34 91.097751
35 91.097751
36 91.097751
37 91.097751
38 91.097751
39 91.097751
OLD
The same with bfill():
import pandas as pd
data = pd.Series(SenorData)
filtered_data = data.mask(data.gt(500)).bfill()
[Figure: plot of filtered_data]
filtered_data:
0 110.324412
1 110.324412
2 110.324412
3 110.324412
4 110.324412
5 110.324412
6 110.324412
7 170.751060
8 170.751060
9 457.434309
10 457.434309
11 457.434309
12 457.434309
13 457.434309
14 457.434309
15 457.434309
16 457.434309
17 457.434309
18 457.434309
19 457.434309
20 457.434309
21 457.434309
22 457.434309
23 457.434309
24 457.434309
25 457.434309
26 457.434309
27 160.336619
28 131.038850
29 131.038850
30 91.097751
31 91.097751
32 91.097751
33 91.097751
34 91.097751
35 91.097751
36 91.097751
37 91.097751
38 91.097751
39 91.097751
dtype: float64
Answered By - Rabinzel