Issue
I have data in the following form:
    pkgSpot     state     id_data_x     id_data_y  pkgSpot_y  delay_mean  delay_max  delay_min
 0        1      Free  7.245899e+08  7.245899e+08        1.0    0.334572      292.0     -161.0
 1        1  Occupied  7.245876e+08  7.245876e+08        1.0    2.865248      116.0     -162.0
 2        2      Free  7.245884e+08  7.245884e+08        2.0    0.122951      294.0      -84.0
 3        2  Occupied  7.245885e+08  7.245885e+08        2.0    1.344130      257.0     -279.0
 4        3      Free  7.245909e+08  7.245909e+08        3.0   -2.931159      261.0     -196.0
 5        3  Occupied  7.245894e+08  7.245894e+08        3.0    1.975265      246.0     -273.0
 6        4      Free  7.245753e+08  7.245753e+08        4.0    0.889908      222.0     -235.0
 7        4  Occupied  7.245729e+08  7.245729e+08        4.0    1.483180      180.0     -117.0
 8       17      Free  7.245742e+08  7.245742e+08       17.0  -10.535714      160.0     -236.0
 9       17  Occupied  7.245744e+08  7.245744e+08       17.0    7.473988      294.0     -258.0
10       18      Free  7.246035e+08  7.246036e+08       18.0   -9.374269      104.0     -160.0
11       18  Occupied  7.246025e+08  7.246025e+08       18.0    8.403315       88.0     -100.0
12       19      Free  7.245642e+08  7.245642e+08       19.0   -4.568548      220.0     -271.0
13       19  Occupied  7.245633e+08  7.245633e+08       19.0    4.474790      253.0     -262.0
14       26      Free  7.245383e+08  7.245383e+08       26.0   -0.480363      280.0     -300.0
15       26  Occupied  7.245365e+08  7.245366e+08       26.0  -10.149856      263.0     -298.0
16       27      Free  7.245861e+08  7.245861e+08       27.0   -3.831683      300.0     -258.0
17       27  Occupied  7.245864e+08  7.245864e+08       27.0    1.077670      300.0     -299.0
18       28      Free  7.245878e+08  7.245878e+08       28.0   -8.868201      221.0     -300.0
19       28  Occupied  7.245891e+08  7.245891e+08       28.0    6.633684      241.0     -220.0
I would like a single figure showing the mean, max and min delay, broken down by pkgSpot and by state.
Is there an easy way to achieve that with pandas, seaborn or matplotlib? I have experimented a little with all three libraries and with the pandas melt
function, but I could not find an easy way to do it.
Thanks for your support.
Solution
Here is an approach using filled areas to show the minimum, mean and maximum:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from io import StringIO
data_str = '''pkgSpot state delay_mean delay_max delay_min
1 Free 0.334572 292.0 -161.0
1 Occupied 2.865248 116.0 -162.0
2 Free 0.122951 294.0 -84.0
2 Occupied 1.344130 257.0 -279.0
3 Free -2.931159 261.0 -196.0
3 Occupied 1.975265 246.0 -273.0
4 Free 0.889908 222.0 -235.0
4 Occupied 1.483180 180.0 -117.0
17 Free -10.535714 160.0 -236.0
17 Occupied 7.473988 294.0 -258.0
18 Free -9.374269 104.0 -160.0
18 Occupied 8.403315 88.0 -100.0
19 Free -4.568548 220.0 -271.0
19 Occupied 4.474790 253.0 -262.0
26 Free -0.480363 280.0 -300.0
26 Occupied -10.149856 263.0 -298.0
27 Free -3.831683 300.0 -258.0
27 Occupied 1.077670 300.0 -299.0
28 Free -8.868201 221.0 -300.0
28 Occupied 6.633684 241.0 -220.0'''
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(StringIO(data_str), sep=r'\s+')
fig, ax = plt.subplots(figsize=(12, 4))
for state, color in zip(['Free', 'Occupied'], ['dodgerblue', 'crimson']):
    df_state = df[df['state'] == state]
    # strings give a categorical x-axis, so the pkgSpots are evenly spaced
    x = df_state['pkgSpot'].astype(str)
    # line for the mean, shaded band from min to max
    ax.plot(x, df_state['delay_mean'], color=color)
    ax.fill_between(x, df_state['delay_min'], df_state['delay_max'], color=color, alpha=0.4, label=state)
ax.set_xlabel('pkgSpot')
ax.set_ylabel('delay (min, mean, max)')
ax.margins(x=0.02)
ax.legend(ncol=2, loc='lower center', bbox_to_anchor=[0.5, 1.01])
plt.tight_layout()
plt.show()
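If you would rather see one row per pkgSpot with the two states side by side (for example to check the numbers before plotting), a pivot gives that layout. This is only a rough sketch and not part of the original answer: `df_sample` and `wide` are names made up here, and just a few rows of the data above are repeated to keep it short.

```python
from io import StringIO

import pandas as pd

# A few rows copied from the data above; the full table works the same way.
sample = '''pkgSpot state delay_mean delay_max delay_min
1 Free 0.334572 292.0 -161.0
1 Occupied 2.865248 116.0 -162.0
2 Free 0.122951 294.0 -84.0
2 Occupied 1.344130 257.0 -279.0
17 Free -10.535714 160.0 -236.0
17 Occupied 7.473988 294.0 -258.0'''
df_sample = pd.read_csv(StringIO(sample), sep=r'\s+')

# One row per pkgSpot, one column per (statistic, state) pair.
wide = df_sample.pivot(index='pkgSpot', columns='state',
                       values=['delay_mean', 'delay_min', 'delay_max'])

# Columns are addressed with a (statistic, state) tuple.
free_mean = wide[('delay_mean', 'Free')]

print(wide)
```

Each column of `wide` can then be passed to `ax.plot` or `ax.fill_between` with `wide.index.astype(str)` as the x values.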
Another option uses error bars:
fig, ax = plt.subplots(figsize=(12, 4))
for state, color, dodge in zip(['Free', 'Occupied'], ['dodgerblue', 'crimson'], [-0.2, 0.2]):
    df_state = df[df['state'] == state]
    # numeric positions, shifted left/right so the two states do not overlap
    x = np.arange(len(df_state)) + dodge
    # yerr holds distances from the mean: [lower, upper]
    yerr = [df_state['delay_mean'] - df_state['delay_min'], df_state['delay_max'] - df_state['delay_mean']]
    ax.errorbar(x, df_state['delay_mean'], yerr=yerr, color=color, ls=':', lw=2, capsize=10, capthick=2, label=state)
ax.set_xticks(np.arange(len(df_state)))
ax.set_xticklabels(df_state['pkgSpot'].astype(str))
ax.set_xlabel('pkgSpot')
ax.set_ylabel('delay (min, mean, max)')
ax.legend(ncol=2, loc='lower center', bbox_to_anchor=[0.5, 1.01])
plt.tight_layout()
plt.show()
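A note on `yerr` in the error bar version: matplotlib expects distances from the plotted point, not the absolute min and max, with the lower distances in the first row and the upper ones in the second. Here is a small sketch of that arithmetic using two 'Free' rows from the data above; the variable names `low` and `high` are just for illustration.

```python
import numpy as np

# Values for pkgSpot 1 and 2 in the 'Free' state, copied from the data above.
mean = np.array([0.334572, 0.122951])
low = np.array([-161.0, -84.0])
high = np.array([292.0, 294.0])

# Row 0: how far each bar extends below the mean.
# Row 1: how far each bar extends above the mean.
yerr = np.vstack([mean - low, high - mean])

# Both rows are non-negative whenever min <= mean <= max.
assert (yerr >= 0).all()

# Adding the offsets back to the mean recovers the original limits.
print(mean - yerr[0])  # lower ends of the bars
print(mean + yerr[1])  # upper ends of the bars
```

Passing the raw `delay_min` and `delay_max` columns as `yerr` would draw bars of the wrong length, so the subtraction is needed.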
Answered By - JohanC