Issue
This should be very easy, but I'm running into several issues. I want to do something like this post, but (1) my date column is a datetime field, so it includes hours, minutes and seconds, and (2) I want to plot a line graph by day.
So, this is my data:
date                  col1  col2
2020-01-01 00:01:020  20    500
2020-01-02 00:01:020  10    500
2020-01-02 00:01:000  20    500
2020-01-02 00:01:021  20    500
2020-02-05 20:11:010  30    500
2020-02-05 10:01:020  10    500
.
.
.
So, as I mentioned above, what I want is to plot the daily average of col1. I started with this:
df.groupby('date')['col1'].mean()
That didn't work because of the hours, minutes and seconds. Later, I tried this:
df["day"] = df["date"].dt.day
df.groupby("day")["col1"].mean().plot(kind="line")
I almost got it, but the day column is not the actual date; it is just a number which, I guess, represents the position of the day in the year. Any ideas on how to make this plot?
Solution
IIUC:
Group by the date instead of the day:
df.groupby(df['date'].dt.date)["col1"].mean().plot(kind="line", rot=25)
# You don't need to create a separate date column for this; pass the dates directly to groupby()
OR
df.groupby(df['date'].dt.normalize())["col1"].mean().plot(kind="line", rot=25)
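To make the difference concrete, here is a minimal sketch comparing dt.day with the two groupings above. The DataFrame is a hypothetical stand-in for the question's data (the timestamps are cleaned up so that they parse, which is an assumption on my part), and the plotting call is left out so the snippet only needs pandas.

```python
import pandas as pd

# Hypothetical sample data, loosely modelled on the question's table
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01 00:01:20",
        "2020-01-02 00:01:20",
        "2020-01-02 00:01:00",
        "2020-01-02 00:01:21",
        "2020-02-05 20:11:10",
        "2020-02-05 10:01:20",
    ]),
    "col1": [20, 10, 20, 20, 30, 10],
    "col2": [500] * 6,
})

# dt.day is the day of the month (1-31), so rows from different
# months that share a day number would end up in the same group
by_day_of_month = df.groupby(df["date"].dt.day)["col1"].mean()

# dt.date gives datetime.date objects (time part dropped)
by_date = df.groupby(df["date"].dt.date)["col1"].mean()

# dt.normalize() keeps the datetime64 dtype and sets the time to midnight
by_midnight = df.groupby(df["date"].dt.normalize())["col1"].mean()

print(by_date)
```

Both by_date and by_midnight hold one mean per calendar day, so calling .plot(kind="line", rot=25) on either should give the daily line plot. The normalize() version keeps a DatetimeIndex, which tends to be handier if you want to resample or format the axis later.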
Optional (these two also work, but the two above fit your data best: pd.Grouper emits a row for every calendar day in the range, so days with no data show up as NaN and have to be dropped):
# via pd.Grouper():
df.groupby(pd.Grouper(key='date', freq='1D'))["col1"].mean().dropna().plot(kind="line")
# OR
# via dt.floor():
df.groupby(df['date'].dt.floor('1D'))["col1"].mean().dropna().plot(kind="line")
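A small sketch of why dropna() shows up in these variants, using a hypothetical three-row frame with a gap on 2020-01-03 (the data is made up for illustration, not taken from the question):

```python
import pandas as pd

# Hypothetical data with no rows on 2020-01-03
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01 00:01:20",
        "2020-01-02 00:01:20",
        "2020-01-04 10:01:20",
    ]),
    "col1": [20, 10, 30],
})

# pd.Grouper bins the whole range day by day, so the empty day gets a NaN mean
with_gaps = df.groupby(pd.Grouper(key="date", freq="1D"))["col1"].mean()

# dropna() removes the empty days again
no_gaps = with_gaps.dropna()

# Grouping by the floored timestamps only produces days that occur in the data
floored = df.groupby(df["date"].dt.floor("1D"))["col1"].mean()

print(with_gaps)
print(floored)
```

If you plot with_gaps without dropna(), the line would be interrupted at the missing day, which may or may not be what you want.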
Output (for the given sample data): a line plot of the daily mean of col1.
Answered By - Anurag Dabas