Issue
I have a dataframe; a sample of the data is shown below.
I am trying to shade the area around the time series plot. I tried the fill_between
function, but it didn't work.
I tried:
# load the file
df = pd.read_csv(r"C:\Users\sam\data.csv", usecols=['Hour','Forecast'],header=0)
X1=df.forecast
mu = X1.mean
sigma = X1.std
timestep=df.Hour
# ss=mu1+sigma1
# kk=mu1-sigma1
plt.fill_between(timestep, mu, sigma, alpha=0.2) #this is the shaded error
sample_data.csv
Hour Forecast
1 0.428732899
2 0.501308875
3 0.491805242
4 0.392900424
5 0.442624008
6 0.411723392
7 0.397455466
8 0.400126642
9 0.444411425
10 0.423408925
11 0.759687642
12 2.166908125
13 2.153370175
14 2.053740002
15 2.095005501
16 2.153214908
17 2.210168766
18 2.122148284
19 1.9024695
20 2.255718026
21 2.258879807
22 0.480089583
23 1.551103332
24 1.512505375
Expected Output:
Solution
The shaded area around the line in a plot like that represents the 95% confidence interval. To get this area you need more than one observation for each time point, so that a standard deviation and a CI can be computed for each time point. But the data you provided has only one observation for each time point.
You can draw a similar plot by computing the standard deviation, then adding it to and subtracting it from the column you want to plot. Pay attention! This is not a confidence interval (for which you need more observations); it is a band 2 standard deviations wide, centered on the value at each time point. Moreover, its width is constant along the time axis.
I honestly doubt this is a useful plot, since the band's width is constant over time and equals 2 times the standard deviation, which is computed across the whole series rather than per time point. In short: you shouldn't use this plot; instead, provide more observations for each time point so that a proper confidence interval can be computed.
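import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv(r"data/data.csv", usecols = ['Hour', 'Forecast'], header = 0)
# the column is named 'Forecast' (capital F), so df.forecast would fail
X1 = df.Forecast
# mean and std are methods: call them with parentheses to get numbers
mu = X1.mean()
sigma = X1.std()
timestep = df.Hour
# band of one standard deviation above and below each value
X1_plus_sigma = X1 + sigma
X1_minus_sigma = X1 - sigma
plt.plot(timestep, X1, color = 'blue')
plt.fill_between(timestep, X1_plus_sigma, X1_minus_sigma, alpha = 0.2, color = 'blue')
plt.show()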
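For comparison, here is a minimal sketch of what a proper per-hour interval could look like if you did have several observations per time point. The data is made up (30 random values per hour), the output file name forecast_ci.png is hypothetical, and the interval uses the normal approximation (mean plus or minus 1.96 standard errors), so treat it as an illustration under those assumptions rather than a recipe for your dataset.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs without a display
import matplotlib.pyplot as plt

# Hypothetical data: 30 made-up observations for each of 24 hours.
rng = np.random.default_rng(0)
hours = np.repeat(np.arange(1, 25), 30)
values = 1.0 + np.sin(hours / 24 * 2 * np.pi) + rng.normal(0, 0.3, hours.size)
obs = pd.DataFrame({"Hour": hours, "Forecast": values})

# Mean, standard deviation and number of observations per hour.
stats = obs.groupby("Hour")["Forecast"].agg(["mean", "std", "count"])

# Approximate 95% CI of the mean: mean +/- 1.96 * standard error.
half_width = 1.96 * stats["std"] / np.sqrt(stats["count"])
lower = stats["mean"] - half_width
upper = stats["mean"] + half_width

fig, ax = plt.subplots()
ax.plot(stats.index, stats["mean"], color="blue")
ax.fill_between(stats.index, lower, upper, alpha=0.2, color="blue")
ax.set_xlabel("Hour")
ax.set_ylabel("Forecast")
fig.savefig("forecast_ci.png")  # hypothetical file name
```

Unlike the constant-width band above, this band changes width from hour to hour, because the standard deviation is computed separately for each time point.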
Answered By - Zephyr