Issue
I want to change the value in one column whenever part of another column matches a given string.
For example, I have the following data frame:
**DATE** **TIME** **VALUE**
20060103 02:01:00 54
20060103 03:02:00 12
20060103 05:03:00 21
20060103 08:05:00 54
20060103 06:06:00 87
20060103 02:07:00 79
20060103 02:08:00 46
I want to set VALUE to 30, but only in rows where the hour part of the TIME column is equal to 02.
So the desired data frame would be:
**DATE** **TIME** **VALUE**
20060103 02:01:00 30
20060103 03:02:00 12
20060103 05:03:00 21
20060103 08:05:00 54
20060103 06:06:00 87
20060103 02:07:00 30
20060103 02:08:00 30
Notice that in rows 1, 6 and 7 the VALUE changed to 30, because the TIME value in those rows starts with the hour 02.
I tried to do it the simple way and go over each row and set the value:
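import pandas as pd

df = pd.read_csv('file.csv')
# Slow: checks the hour prefix one row at a time
for i, a in df['TIME'].items():
    if a[:2] == '02':
        # Set only this row; df["VALUE"] = 30 would overwrite the whole column
        df.loc[i, 'VALUE'] = 30
df.to_csv("file.csv", index=False)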
Unfortunately, the file has tens of millions of lines, and this method will take forever. I would appreciate it if anyone could suggest a faster, more efficient method.
Thanks!
Solution
Try loc assignment:
df.loc[pd.to_datetime(df['TIME']).dt.hour == 2, 'VALUE'] = 30
Or:
df.loc[df['TIME'].str[:2] == '02', 'VALUE'] = 30
Answered By - U12-Forward
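As a quick check, here is a minimal, self-contained sketch that applies both masks to the sample data from the question. It assumes the file is comma-separated with DATE, TIME and VALUE headers (the inline `csv_text` stands in for `file.csv`), and the explicit `format="%H:%M:%S"` is an added assumption about how TIME is written, not part of the original answer.

```python
import io

import pandas as pd

# Inline stand-in for file.csv (assumed comma-separated, same sample rows as the question)
csv_text = """DATE,TIME,VALUE
20060103,02:01:00,54
20060103,03:02:00,12
20060103,05:03:00,21
20060103,08:05:00,54
20060103,06:06:00,87
20060103,02:07:00,79
20060103,02:08:00,46
"""

df = pd.read_csv(io.StringIO(csv_text))

# Option 1: compare the first two characters of the TIME string
mask_str = df["TIME"].str[:2] == "02"

# Option 2: parse TIME and compare the hour as a number
# (the explicit format is an assumption about the data)
mask_dt = pd.to_datetime(df["TIME"], format="%H:%M:%S").dt.hour == 2

# Both masks select the same rows; one vectorized assignment updates them all
df.loc[mask_str, "VALUE"] = 30

print(df)
```

Both options build a boolean mask over the whole column in one pass, so there is no Python-level loop over rows. The string slice avoids datetime parsing altogether, which may matter on a file of this size.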