Issue
I have a dataset with two columns, "new_date" and "Sales". It captures a company's daily sales over three months of a single year, 2020 (January, February, and March). The dataset has about 8000 rows. A single day can have several transactions with different sales amounts.
new_date Sales
2020-01-26 453
2020-01-26 232
2020-02-03 123
2020-02-03 223
2020-03-13 333
2020-03-23 657
My question: is it possible to plot a time series for such a short date range, and what is the best way to do it?
I simply tried to use plot
df.plot(legend=False)
But the result is not as good as I expected.
Is there any better way to visualize and organize this time series data?
Solution
I'm not sure exactly what you are looking for, but based on the data I guess you could sum the Sales by new_date first.
df.groupby('new_date').sum().plot(legend=False)
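As a minimal sketch of that daily sum, here is the same idea applied to the six sample rows from the question. It assumes new_date may have been loaded as plain strings, so it is parsed with pd.to_datetime first; the variable name daily is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; the real dataset has about 8000 rows.
df = pd.DataFrame({
    "new_date": ["2020-01-26", "2020-01-26", "2020-02-03",
                 "2020-02-03", "2020-03-13", "2020-03-23"],
    "Sales": [453, 232, 123, 223, 333, 657],
})

# Assumption: new_date was read as text, so convert it to real datetimes.
df["new_date"] = pd.to_datetime(df["new_date"])

# One value per day: add up all transactions that share a date.
daily = df.groupby("new_date")["Sales"].sum()
print(daily)
```

Calling daily.plot(legend=False) on the result should then draw one point per day instead of one per transaction, and the x-axis is labelled with dates because the index holds datetimes.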
When you want to sum Sales per week, you can use resample:
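import pandas as pd
import random
# Example data: one random Sales value per day for Jan-Mar 2022 (90 days)
df = pd.DataFrame({
    'new_date': pd.date_range('2022-01-01', '2022-03-31'),
    'Sales': random.sample(range(100, 1000), 90)}).set_index('new_date')
# resample needs a DatetimeIndex; 'W' groups the rows into weekly bins
df.resample('W').sum().plot(legend=False)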
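The example above uses generated data with exactly one row per day. A rough sketch of the same weekly resample on data shaped like the question's (several transactions per date, gaps between dates) could look like this. It assumes new_date needs parsing from strings, and the name weekly is only illustrative.

```python
import pandas as pd

# Sample rows copied from the question, with repeated dates and gaps.
df = pd.DataFrame({
    "new_date": ["2020-01-26", "2020-01-26", "2020-02-03",
                 "2020-02-03", "2020-03-13", "2020-03-23"],
    "Sales": [453, 232, 123, 223, 333, 657],
})

# resample works on a DatetimeIndex, so parse the dates and index by them.
df["new_date"] = pd.to_datetime(df["new_date"])
df = df.set_index("new_date")

# 'W' bins end on Sundays by default; weeks without any rows sum to 0.
weekly = df["Sales"].resample("W").sum()
print(weekly)
```

Weeks with no transactions show up as 0 here, which keeps the time axis evenly spaced when weekly is plotted. Whether those zeros are meaningful depends on the data, so dropping or masking them may be the better choice for a sparse dataset.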
Answered By - Rene