Sunday, November 28, 2021

[FIXED] shading the timeseries plot in python

November 28, 2021 data-visualization, matplotlib, plot, python, seaborn No comments

Issue

I have a dataframe; a sample of the data is shown below.

I am trying to shade the area around the timeseries plot. I tried the fill_between function, but it didn't work.

I tried:

# load the file
df = pd.read_csv(r"C:\Users\sam\data.csv", usecols=['Hour','Forecast'],header=0)

X1=df.forecast
mu = X1.mean
sigma = X1.std

timestep=df.Hour

# ss=mu1+sigma1
# kk=mu1-sigma1
 
plt.fill_between(timestep, mu, sigma, alpha=0.2) #this is the shaded error

`sample_data.csv`

Hour Forecast
1   0.428732899
2   0.501308875
3   0.491805242
4   0.392900424
5   0.442624008
6   0.411723392
7   0.397455466
8   0.400126642
9   0.444411425
10  0.423408925
11  0.759687642
12  2.166908125
13  2.153370175
14  2.053740002
15  2.095005501
16  2.153214908
17  2.210168766
18  2.122148284
19  1.9024695
20  2.255718026
21  2.258879807
22  0.480089583
23  1.551103332
24  1.512505375

Expected Output: a line plot with a shaded band around the line (image not included).

Solution

The shaded area around the lines represents the 95% confidence interval. To get this area you need more than one observation for each time point, so that a standard deviation and a CI can be computed for each time point. In the data you provided there is only one observation for each time point.
You can draw a similar plot by computing the standard deviation of the column you want to plot, then adding it to and subtracting it from that column. Pay attention! This is not a confidence interval (for which you need more observations); it is a band two standard deviations wide, centered on the plotted value at each time point. Moreover, it keeps a constant width along the time axis.
I honestly doubt this is a useful plot, since the band's width is constant over time and equals two times the standard deviation, which is computed across the whole time series. In short: you shouldn't use this plot; instead, provide more observations for each time point so that a proper confidence interval can be computed.
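As for why the original attempt did not work, two details in the snippet from the question stand out: `df.forecast` does not match the column name `Forecast`, and `X1.mean` / `X1.std` written without parentheses are method objects rather than numbers. A minimal sketch on three rows copied from the sample data (the small inline DataFrame is only a stand-in for the CSV, and the names `mu_method` and `mu_value` are illustrative):

```python
import pandas as pd

# Stand-in for the CSV: the first three rows of the sample data.
df = pd.DataFrame({
    "Hour": [1, 2, 3],
    "Forecast": [0.428732899, 0.501308875, 0.491805242],
})

# Column names are case-sensitive: the column is "Forecast", not "forecast".
has_lowercase = hasattr(df, "forecast")

X1 = df.Forecast

mu_method = X1.mean    # no parentheses: a bound method, not a number
mu_value = X1.mean()   # parentheses: the actual mean as a float

print(has_lowercase)
print(callable(mu_method))
print(isinstance(mu_value, float))
```

Even with both details fixed, `plt.fill_between(timestep, mu, sigma)` would shade the region between two constants (the mean and the standard deviation), not a band that follows the series.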

import pandas as pd
import matplotlib.pyplot as plt

# load only the two columns that are plotted
df = pd.read_csv(r"data/data.csv", usecols=['Hour', 'Forecast'], header=0)

X1 = df.Forecast
sigma = X1.std()  # one standard deviation over the whole series (a scalar)

timestep = df.Hour

# band of constant width: one sigma below and one sigma above each value
X1_plus_sigma = X1 + sigma
X1_minus_sigma = X1 - sigma

plt.plot(timestep, X1, color='blue')
plt.fill_between(timestep, X1_minus_sigma, X1_plus_sigma, alpha=0.2, color='blue')

plt.show()
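
For contrast with the constant-width band, here is a minimal sketch of how a per-time-point 95% interval could be computed if several observations per hour were available. Everything about the data is an assumption: `long_df` is synthetic (30 random draws per hour), and the interval uses the normal approximation (mean ± 1.96 × standard error), which is one common choice and not necessarily how the expected plot was produced.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Synthetic stand-in data (an assumption, not the asker's data):
# 30 repeated observations for each of 24 hours.
rng = np.random.default_rng(0)
hours = np.arange(1, 25)
n_obs = 30
long_df = pd.DataFrame({
    "Hour": np.repeat(hours, n_obs),
    "Forecast": rng.normal(loc=1.0, scale=0.3, size=hours.size * n_obs),
})

# Per-hour mean, standard deviation and number of observations.
stats = long_df.groupby("Hour")["Forecast"].agg(["mean", "std", "count"])

# Standard error of the mean for each hour.
sem = stats["std"] / np.sqrt(stats["count"])

# Normal-approximation 95% interval around each hourly mean.
lower = stats["mean"] - 1.96 * sem
upper = stats["mean"] + 1.96 * sem

fig, ax = plt.subplots()
ax.plot(stats.index, stats["mean"], color="blue")
ax.fill_between(stats.index, lower, upper, alpha=0.2, color="blue")
ax.set_xlabel("Hour")
ax.set_ylabel("Forecast")
```

Calling `plt.show()` afterwards displays the figure. Unlike the band above, the width here changes from hour to hour, because each hour has its own standard deviation.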

Answered By - Zephyr

This Answer collected from stackoverflow and tested by PythonFixing community admins, is licensed under cc by-sa 2.5 , cc by-sa 3.0 and cc by-sa 4.0
