Issue
I want to get a column value on a specific day and, for rows that are not on that day, fill the column with float('nan'):
for index, row in df.iterrows():
    if index == '2000-03-20 00:00:00':
        df['event'] = row['close']
    else:
        df['event'] = float('nan')
But df['event'] returns nan even on the specified day, 2000-03-20 00:00:00:
df.loc['2000-03-20 00:00:00', 'event']
nan
Solution
A vectorized solution is, for example, Series.where. If you need to ignore the times, use DatetimeIndex.normalize and compare with a Timestamp:
import pandas as pd

rng = pd.date_range('2000-03-19', periods=10, freq='9H')
df = pd.DataFrame({'close': range(10)}, index=rng)
df['event'] = df['close'].where(df.index.normalize() == pd.Timestamp('2000-03-20'))
print (df)
close event
2000-03-19 00:00:00 0 NaN
2000-03-19 09:00:00 1 NaN
2000-03-19 18:00:00 2 NaN
2000-03-20 03:00:00 3 3.0
2000-03-20 12:00:00 4 4.0
2000-03-20 21:00:00 5 5.0
2000-03-21 06:00:00 6 NaN
2000-03-21 15:00:00 7 NaN
2000-03-22 00:00:00 8 NaN
2000-03-22 09:00:00 9 NaN
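As a rough sketch of what the mask looks like before it is passed to where (the names idx and mask are only illustrative, and the frequency is given as a Timedelta here just to sidestep alias differences between pandas versions):

```python
import pandas as pd

# Ten timestamps, 9 hours apart, starting at 2000-03-19 00:00:00.
idx = pd.date_range('2000-03-19', periods=10, freq=pd.Timedelta(hours=9))

# normalize() resets every timestamp to midnight of its own day,
# so comparing with a midnight Timestamp matches the whole day.
mask = idx.normalize() == pd.Timestamp('2000-03-20')

print(idx[mask])   # the three timestamps that fall on 2000-03-20
print(mask.sum())  # 3
```

Series.where keeps the values where the mask is True and puts NaN everywhere else, which is why the event column becomes float.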
Or use partial string indexing:
df.loc['2000-03-20', 'event'] = df['close']
print (df)
close event
2000-03-19 00:00:00 0 NaN
2000-03-19 09:00:00 1 NaN
2000-03-19 18:00:00 2 NaN
2000-03-20 03:00:00 3 3.0
2000-03-20 12:00:00 4 4.0
2000-03-20 21:00:00 5 5.0
2000-03-21 06:00:00 6 NaN
2000-03-21 15:00:00 7 NaN
2000-03-22 00:00:00 8 NaN
2000-03-22 09:00:00 9 NaN
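A small sketch of how partial string indexing selects rows on its own, assuming the same kind of 9-hour DatetimeIndex (the variable names day and month are illustrative):

```python
import pandas as pd

rng = pd.date_range('2000-03-19', periods=10, freq=pd.Timedelta(hours=9))
df = pd.DataFrame({'close': range(10)}, index=rng)

# A date-only string selects every row that falls on that day.
day = df.loc['2000-03-20']
print(day['close'].tolist())  # [3, 4, 5]

# A coarser string selects a coarser period, here the whole month.
month = df.loc['2000-03']
print(len(month))  # 10
```

Because the assignment df.loc['2000-03-20', 'event'] = df['close'] aligns on the index, only the rows selected by the string receive values.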
If the index has no times, compare it directly:
rng = pd.date_range('2000-03-19', periods=10)
df = pd.DataFrame({'close': range(10)}, index=rng)
df['event'] = df['close'].where(df.index == '2000-03-20 00:00:00')
print (df)
close event
2000-03-19 0 NaN
2000-03-20 1 1.0
2000-03-21 2 NaN
2000-03-22 3 NaN
2000-03-23 4 NaN
2000-03-24 5 NaN
2000-03-25 6 NaN
2000-03-26 7 NaN
2000-03-27 8 NaN
2000-03-28 9 NaN
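To illustrate the difference between an exact comparison and a per-day comparison, here is a minimal sketch (daily, every_9h and target are made-up names for this example):

```python
import pandas as pd

target = pd.Timestamp('2000-03-20 00:00:00')

daily = pd.date_range('2000-03-19', periods=10)
every_9h = pd.date_range('2000-03-19', periods=10, freq=pd.Timedelta(hours=9))

# Exact comparison matches only a row that sits exactly at midnight.
exact_daily = daily == target
exact_9h = every_9h == target

# Normalizing first matches every row on that calendar day.
by_day_9h = every_9h.normalize() == target

print(exact_daily.sum())  # 1
print(exact_9h.sum())     # 0, no row sits exactly at 2000-03-20 00:00:00
print(by_day_9h.sum())    # 3
```

So the direct comparison is enough only when every timestamp in the index is at midnight.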
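Your solution does not work because the loop overwrites the whole event column on every iteration: df['event'] = ... assigns to all rows, not to the row at the current index, so only the value from the last iteration survives. Looping is not recommended because it is slow, but for learning, here is your solution changed to assign by index with df.loc: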
for index, row in df.iterrows():
    if index.normalize() == pd.Timestamp('2000-03-20 00:00:00'):
        df.loc[index, 'event'] = row['close']
    else:
        df.loc[index, 'event'] = float('nan')
print (df)
close event
2000-03-19 00:00:00 0 NaN
2000-03-19 09:00:00 1 NaN
2000-03-19 18:00:00 2 NaN
2000-03-20 03:00:00 3 3.0
2000-03-20 12:00:00 4 4.0
2000-03-20 21:00:00 5 5.0
2000-03-21 06:00:00 6 NaN
2000-03-21 15:00:00 7 NaN
2000-03-22 00:00:00 8 NaN
2000-03-22 09:00:00 9 NaN
And for an index without times:
for index, row in df.iterrows():
    if index == pd.Timestamp('2000-03-20 00:00:00'):
        df.loc[index, 'event'] = row['close']
    else:
        df.loc[index, 'event'] = float('nan')
print (df)
close event
2000-03-19 0 NaN
2000-03-20 1 1.0
2000-03-21 2 NaN
2000-03-22 3 NaN
2000-03-23 4 NaN
2000-03-24 5 NaN
2000-03-25 6 NaN
2000-03-26 7 NaN
2000-03-27 8 NaN
2000-03-28 9 NaN
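To make the cause of the original problem concrete, this short sketch contrasts whole-column assignment with assignment to a single cell (the small four-row frame is only for illustration):

```python
import pandas as pd

rng = pd.date_range('2000-03-19', periods=4)
df = pd.DataFrame({'close': range(4)}, index=rng)

# Assigning a scalar to df['event'] broadcasts it to every row.
df['event'] = 1.0
print(df['event'].tolist())  # [1.0, 1.0, 1.0, 1.0]

# A later whole-column assignment replaces all of those values again.
df['event'] = float('nan')
print(df['event'].isna().all())  # True

# df.loc with a row label and a column name changes one cell only.
day = pd.Timestamp('2000-03-20')
df.loc[day, 'event'] = df.loc[day, 'close']
print(df['event'].notna().sum())  # 1
```

This is why the loop in the question ended with NaN in every row: the last iteration took the else branch and overwrote the entire column.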
Answered By - jezrael