Issue
I want to shift the rows of a DataFrame column, where the shift period for each row is taken from another column of the same DataFrame:
     value  shiftperiod shiftedvalue
row1     a            0            a
row2     b            0            b
row3     c            1            b
row4     d            3            a
row5     e            4            a
row6     f            2            d
row7     g            1            f
Initially I thought this could be done as:
df['shiftedvalue']=df['value'].shift(df['shiftperiod'])
However, the shift function does not accept a column (Series) as a per-row value for its periods argument.
How can I efficiently implement this kind of dynamic shifting based on another column?
Solution
Use numpy indexing for that:
import numpy as np

# convert the "value" column to a NumPy array
a = df['value'].to_numpy()
# build the positions 0..n-1 and subtract the shift period from each
idx = np.arange(len(df)) - df['shiftperiod'].to_numpy()
# index into the array, keeping only valid positions (NaN elsewhere)
df['shiftedvalue'] = np.where((idx >= 0) & (idx < len(df)),
                              a[np.clip(idx, 0, len(df) - 1)],
                              np.nan)
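If you are sure that shiftperiod can never yield a nonexistent position (a negative position would silently wrap around to the end of the array, and a position past the end would raise an IndexError), you can simplify the last line to: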
df['shiftedvalue'] = a[idx]
Or use a Series with a range index and reindex, converting the result with to_numpy so that the assignment is positional rather than aligned on the index:
s = df['value'].reset_index(drop=True)
df['shiftedvalue'] = s.reindex(s.index-df['shiftperiod']).to_numpy()
Output:
     value  shiftperiod shiftedvalue
row1     a            0            a
row2     b            0            b
row3     c            1            b
row4     d            3            a
row5     e            4            a
row6     f            2            d
row7     g            1            f
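For anyone who wants to reproduce the output above, here is a minimal self-contained sketch. The DataFrame is rebuilt from the table in the question, and the column name shifted_reindex is only an illustrative name, not part of the original answer. The reindex variant here subtracts a NumPy array from the range index, a small variation on the line shown earlier.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question
df = pd.DataFrame(
    {"value": list("abcdefg"), "shiftperiod": [0, 0, 1, 3, 4, 2, 1]},
    index=[f"row{i}" for i in range(1, 8)],
)

n = len(df)
a = df["value"].to_numpy()

# Position each row should read from: its own position minus its shift period
idx = np.arange(n) - df["shiftperiod"].to_numpy()

# Only positions inside [0, n) are valid; everything else becomes NaN
valid = (idx >= 0) & (idx < n)
df["shiftedvalue"] = np.where(valid, a[np.clip(idx, 0, n - 1)], np.nan)

# Same result via reindex on a positional (range) index.
# "shifted_reindex" is a hypothetical column name used only for comparison.
s = df["value"].reset_index(drop=True)
df["shifted_reindex"] = s.reindex(s.index - df["shiftperiod"].to_numpy()).to_numpy()

print(df)
```

Both columns should agree on this data, since every computed position is valid.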
Example with invalid shiftperiod values:
     value  shiftperiod shiftedvalue
row1     a            0            a
row2     b            0            b
row3     c            1            b
row4     d            3            a
row5     e            5          NaN
row6     f            2            d
row7     g           -1          NaN
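The invalid case can be checked the same way. This is a small sketch using the data from the table above. It also illustrates why the bounds check matters: plain NumPy indexing with a negative position does not fail, it wraps around to the end of the array. The variable name wrapped is only for illustration.

```python
import numpy as np
import pandas as pd

# Same data as before, but row5 shifts past the start and row7 past the end
df = pd.DataFrame(
    {"value": list("abcdefg"), "shiftperiod": [0, 0, 1, 3, 5, 2, -1]},
    index=[f"row{i}" for i in range(1, 8)],
)

n = len(df)
a = df["value"].to_numpy()
idx = np.arange(n) - df["shiftperiod"].to_numpy()  # row5 -> -1, row7 -> 7

# Out-of-range positions are masked to NaN instead of being looked up
valid = (idx >= 0) & (idx < n)
df["shiftedvalue"] = np.where(valid, a[np.clip(idx, 0, n - 1)], np.nan)
print(df)

# Without the mask, a negative position silently wraps around:
# position -1 is the last element, not a missing value
wrapped = a[idx[4]]
print(wrapped)
```

Here row5 would have silently received the last value of the column if the unchecked a[idx] form had been used, which is why that shortcut is only safe when every position is known to be valid.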
Answered By - mozway