Issue
I have a data set that represents hourly rainfall over one day. I'm creating a column 'E1' that numbers the rainfall events: it should start from zero, take the next event number while column 'value' is greater than zero, and return to zero when column 'value' becomes zero again. When the next event starts, the numbering must continue from the previous event number. My attempt so far (stored in 'E2'):
condition = ((df['value'] > 0) & (df['value'].shift(periods=1) == 0))
df['E2'] = (condition).cumsum()
print(df)
hour value E2
0 0 0.0 0
1 1 0.2 1
2 2 0.2 1
3 3 0.2 1
4 4 0.0 1
5 5 0.2 2
6 6 0.2 2
7 7 0.0 2
8 8 NaN 2
9 9 0.2 2
10 10 0.0 2
11 11 0.0 2
12 12 0.2 3
13 13 0.2 3
14 14 0.0 3
15 15 NaN 3
16 16 0.2 3
17 17 0.0 3
18 18 0.2 4
19 19 0.0 4
20 20 0.2 5
21 21 0.2 5
22 22 NaN 5
23 23 0.0 5
E1 represents the event number. An event can last one or several hours, and it should only be counted when the cell before the start of the event is zero and the cell after its last value is also zero.
I'm stuck trying to number the events. I should get:
hour value E2
0 0 0.0 0
1 1 0.2 1
2 2 0.2 1
3 3 0.2 1
4 4 0.0 0
5 5 0.2 2
6 6 0.2 2
7 7 0.0 0
8 8 NaN 0
9 9 0.2 0
10 10 0.0 0
11 11 0.0 0
12 12 0.2 3
13 13 0.2 3
14 14 0.0 0
15 15 NaN 0
16 16 0.2 0
17 17 0.0 0
18 18 0.2 4
19 19 0.0 0
20 20 0.2 0
21 21 0.2 0
22 22 NaN 0
23 23 0.0 0
Solution
I find this an odd criterion, but here's how to compute your "event" numbers. Because you're looking both forward and backward, this is awkward to express in a vectorized way, so a plain loop over the values is the most direct approach.
import numpy as np
import pandas as pd
data = [
0.0,
0.2,
0.2,
0.2,
0.0,
0.2,
0.2,
0.0,
np.nan,
0.2,
0.0,
0.0,
0.2,
0.2,
0.0,
np.nan,
0.2,
0.0,
0.2,
0.0,
0.2,
0.2,
np.nan,
0.0
]
data = [[k] for k in data]
df = pd.DataFrame( data, columns=['data'])
print(df)
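nxt = 1                                      # next event number to assign
nums = np.zeros(len(df['data']), dtype=int)  # event number per row, 0 = no event
start = None                                 # first row of a candidate event, if any
for ndx, v in enumerate(df['data']):
    if np.isnan(v):
        # A missing value invalidates the candidate event in progress.
        start = None
    elif not v:
        # A zero closes the candidate event, provided it began right after
        # a zero and contains at least one row.
        if start is not None and start < ndx:
            nums[start:ndx] = nxt
            nxt += 1
        # The next candidate event can begin on the following row.
        start = ndx + 1
df['E1'] = nums
print(df)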
Output:
data
0 0.0
1 0.2
2 0.2
3 0.2
4 0.0
5 0.2
6 0.2
7 0.0
8 NaN
9 0.2
10 0.0
11 0.0
12 0.2
13 0.2
14 0.0
15 NaN
16 0.2
17 0.0
18 0.2
19 0.0
20 0.2
21 0.2
22 NaN
23 0.0
data E1
0 0.0 0
1 0.2 1
2 0.2 1
3 0.2 1
4 0.0 0
5 0.2 2
6 0.2 2
7 0.0 0
8 NaN 0
9 0.2 0
10 0.0 0
11 0.0 0
12 0.2 3
13 0.2 3
14 0.0 0
15 NaN 0
16 0.2 0
17 0.0 0
18 0.2 4
19 0.0 0
20 0.2 0
21 0.2 0
22 NaN 0
23 0.0 0
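For what it's worth, the same numbering can probably be produced without an explicit loop. The following is only a sketch under a few assumptions: rainfall values are never negative, the column is named 'data' as above, and the helper names (wet, run_id, prev_zero, next_zero, valid, new_event) are made up for illustration. It labels each run of consecutive wet hours, keeps a run only if the cell just before it and the cell just after it are exactly zero (NaN or the edge of the data do not count), and then numbers the surviving runs.

```python
import numpy as np
import pandas as pd

values = [0.0, 0.2, 0.2, 0.2, 0.0, 0.2, 0.2, 0.0, np.nan, 0.2, 0.0, 0.0,
          0.2, 0.2, 0.0, np.nan, 0.2, 0.0, 0.2, 0.0, 0.2, 0.2, np.nan, 0.0]
df = pd.DataFrame({'data': values})

s = df['data']
wet = s > 0  # NaN compares as False, so missing hours are never "wet"

# Give every run of consecutive identical wet/dry flags its own label.
run_id = wet.ne(wet.shift(fill_value=False)).cumsum()

# Is the neighbouring cell exactly zero? NaN and the data edges give False.
prev_zero = s.shift(1).eq(0)
next_zero = s.shift(-1).eq(0)

# A run qualifies if the cell before its first row and the cell after
# its last row are both zero.
starts_ok = prev_zero.groupby(run_id).transform('first')
ends_ok = next_zero.groupby(run_id).transform('last')
valid = wet & starts_ok & ends_ok

# Number the qualifying runs in order; every other row gets 0.
new_event = valid & ~valid.shift(fill_value=False)
df['E1'] = new_event.cumsum().where(valid, 0).astype(int)
print(df)
```

On the sample data this should give the same E1 column as the loop. Whether it is actually clearer than the loop is debatable; the loop is easier to adapt if the event rules change.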
Answered By - Tim Roberts