Issue
I am trying to plot a dual-axis graph with twinx, but I am facing some difficulties.
Before I plotted the secondary y-axis (z, the pink line), the closing value line displayed perfectly. Now it appears cut off, and I am unsure why.
It may be due to my data, which is provided below.
Update: the graph is fixed now, but I am having trouble plotting the legend. It only shows the legend entry for one of the plots.
Any input is welcome! If you require any additional information, I am happy to provide it.
Code I have currently:
# read csv into variable
sg_df_merged = pd.read_csv("testing1.csv", parse_dates=[0], index_col=0)
# define figure
fig = plt.figure()
fig, ax5 = plt.subplots()
ax6 = ax5.twinx()
x = sg_df_merged.index
y = sg_df_merged["Adj Close"]
z = sg_df_merged["Singapore"]
curve1 = ax5.plot(x, y, label="Singapore", color = "c")
curve2 = ax6.plot(x, z, label = "Face Mask Compliance", color = "m")
curves = [curve1, curve2]
# labels for my axis
ax5.set_xlabel("Year")
ax5.set_ylabel("Adjusted Closing Value ($)")
ax6.set_ylabel("% compliance to wearing face mask")
ax5.grid #not sure what this line does actually
# set x-axis values to 45 degree angle
for label in ax5.xaxis.get_ticklabels():
    label.set_rotation(45)
ax5.grid(True, color = "k", linestyle = "-", linewidth = 0.3)
plt.gca().legend(loc='center left', bbox_to_anchor=(1.1, 0.5), title = "Country Index")
plt.show();
Initially, I thought this was caused by entirely blank rows in my Excel file, but I have since removed those rows. The sample data is included in this question.
I have also tried to interpolate the missing values, but that did not work either.
Solution
- Only rows where all values were NaN were dropped. There are still many rows containing NaN.
- For matplotlib to draw a connecting line between two data points, the points must be consecutive.
- The plot API does not connect the data across NaN values.
- This can be dealt with by converting each pandas.Series to a DataFrame and using .dropna.
- Note that x has been removed, because it would not match the index length of y or z, which are shorter after .dropna.
- y is now a separate dataframe, where .dropna is used.
- z is also a separate dataframe, where .dropna is used.
- The x-axis values for each plot are the respective indices.
- Tested in python v3.12.0, pandas v2.1.2, matplotlib v3.8.1.
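The point about consecutive values can be checked without plotting anything. The following is a minimal sketch on made-up data, not the data from the question: the helper count_drawable_segments is a hypothetical name introduced here, and it simply counts adjacent pairs of non-NaN rows, which are the only places a line segment can be drawn.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's data: values separated by NaN rows
s = pd.Series(
    [1.0, np.nan, 2.0, np.nan, 3.0],
    index=pd.to_datetime(
        ["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05"]
    ),
)


def count_drawable_segments(series):
    """Count pairs of adjacent rows that are both non-NaN."""
    valid = series.notna().to_numpy()
    return int((valid[:-1] & valid[1:]).sum())


print(count_drawable_segments(s))           # 0: every point is isolated by NaN
print(count_drawable_segments(s.dropna()))  # 2: the three points are now adjacent
```

With no drawable segments, a line plot without markers shows nothing for that column, which matches the "cut" line in the question. After .dropna the remaining points are adjacent, so they are joined.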
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# read csv into variable
sg_df_merged = pd.read_csv("test.csv", parse_dates=[0], index_col=0)
# define figure
fig, ax5 = plt.subplots(figsize=(8, 6))
ax6 = ax5.twinx()
# select specific columns to plot and drop additional NaN
y = pd.DataFrame(sg_df_merged["Adj Close"]).dropna()
z = pd.DataFrame(sg_df_merged["Singapore"]).dropna()
# add plots with markers
curve1 = ax5.plot(y.index, 'Adj Close', data=y, label="Singapore", color = "c", marker='o')
curve2 = ax6.plot(z.index, 'Singapore', data=z, label = "Face Mask Compliance", color = "m", marker='o')
# labels for my axis
ax5.set_xlabel("Year")
ax5.set_ylabel("Adjusted Closing Value ($)")
ax6.set_ylabel("% compliance to wearing face mask")
# rotate xticks
ax5.xaxis.set_tick_params(rotation=45)
# add a grid to ax5
ax5.grid(True, color = "k", linestyle = "-", linewidth = 0.3)
# create a legend for both axes
curves = curve1 + curve2
labels = [l.get_label() for l in curves]
ax5.legend(curves, labels, loc='center left', bbox_to_anchor=(1.1, 0.5), title = "Country Index")
plt.show()
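As a possible alternative to keeping the return values of ax.plot, the handles and labels can be collected from each axes with get_legend_handles_labels. This is only a sketch with made-up values; the names ax_left and ax_right are assumptions for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt

fig, ax_left = plt.subplots()
ax_right = ax_left.twinx()

# Made-up values purely for illustration
ax_left.plot([0, 1, 2], [3000, 3100, 3050], color="c", label="Singapore")
ax_right.plot([0, 1, 2], [20, 60, 90], color="m", label="Face Mask Compliance")

# Each Axes only knows about its own artists, so gather from both
handles_left, labels_left = ax_left.get_legend_handles_labels()
handles_right, labels_right = ax_right.get_legend_handles_labels()
handles = handles_left + handles_right
labels = labels_left + labels_right

# A single legend on one axes that lists the lines from both
ax_left.legend(handles, labels, loc="center left",
               bbox_to_anchor=(1.1, 0.5), title="Country Index")
```

This avoids tracking curve1 and curve2 by hand, which may be convenient when several lines are added to each axes. In an interactive session, drop the matplotlib.use line and call plt.show() at the end.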
- This can be implemented more succinctly as follows.
- "Tick labels on a pandas datetime axis are not aligned with the ticks" outlines three easy ways to center-align the xticklabels.
# given a datetime[ns] dtype index, if the time components are all 0, extracting only the date will cause the xticklabels to be centered under the tick
df.index = df.index.date
ax = df['Adj Close'].dropna().plot(marker='.', color='c', grid=True, figsize=(12, 6),
                                   title='My Plot', ylabel='Adj Close', xlabel='Date', legend='Adj Close')
ax_right = df['Singapore'].dropna().plot(marker='.', color='m', secondary_y=True, legend='Singapore', rot=0, ax=ax)
ax_right.set_ylabel('Singapore')
ax.legend(title='Country Index', bbox_to_anchor=(1.06, 0.5), loc='center left', frameon=False)
ax_right.legend(bbox_to_anchor=(1.06, 0.43), loc='center left', frameon=False)
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
ax.xaxis.set_minor_locator(mdates.MonthLocator())
Data
- Copy the data to the clipboard and read with the following line.
df = pd.read_clipboard(sep=',', index_col=[0], parse_dates=[0])
,Adj Close,Singapore
2015-10-01,2998.350098,
2015-11-01,2855.939941,
2015-12-01,2882.72998,
2016-01-01,2629.110107,
2016-02-01,2666.51001,
2016-03-01,2840.899902,
2016-04-01,2838.52002,
2016-05-01,2791.060059,
2016-06-01,2840.929932,
2016-07-01,2868.689941,
2016-08-01,2820.590088,
2016-09-01,2869.469971,
2016-10-01,2813.8701170000004,
2016-11-01,2905.169922,
2016-12-01,2880.76001,
2017-01-01,3046.800049,
2017-02-01,3096.610107,
2017-03-01,3175.110107,
2017-04-01,3175.439941,
2017-05-01,3210.820068,
2017-06-01,3226.47998,
2017-07-01,3329.52002,
2017-08-01,3277.26001,
2017-09-01,3219.909912,
2017-10-01,3374.080078,
2017-11-01,3433.540039,
2017-12-01,3402.919922,
2018-01-01,3533.98999,
2018-02-01,3517.939941,
2018-03-01,3427.969971,
2018-04-01,3613.929932,
2018-05-01,3428.179932,
2018-06-01,3268.699951,
2018-07-01,3319.850098,
2018-08-01,3213.47998,
2018-09-01,3257.050049,
2018-10-01,3018.800049,
2018-11-01,3117.610107,
2018-12-01,3068.76001,
2019-01-01,3190.169922,
2019-02-01,3212.689941,
2019-03-01,3212.879883,
2019-04-01,3400.199951,
2019-05-01,3117.76001,
2019-06-01,3321.610107,
2019-07-01,3300.75,
2019-08-01,3106.52002,
2019-09-01,3119.98999,
2019-10-01,3229.879883,
2019-11-01,3193.919922,
2019-12-01,3222.830078,
2020-01-01,3153.72998,
2020-02-01,3011.080078,
2020-02-21,,24.0
2020-02-25,,
2020-02-28,,22.0
2020-03-01,2481.22998,
2020-03-02,,
2020-03-03,,
2020-03-06,,23.0
2020-03-10,,
2020-03-13,,21.0
2020-03-17,,
2020-03-20,,24.0
2020-03-23,,
2020-03-24,,
2020-03-27,,27.0
2020-03-30,,
2020-03-31,,
2020-04-01,2624.22998,
2020-04-03,,37.0
2020-04-06,,
2020-04-07,,
2020-04-10,,73.0
2020-04-13,,
2020-04-14,,
2020-04-17,,85.0
2020-04-20,,
2020-04-21,,
2020-04-24,,90.0
2020-04-27,,
2020-04-28,,
2020-05-01,2510.75,90.0
2020-05-05,,
2020-05-15,,
2020-05-21,,
2020-05-22,,92.0
2020-05-25,,
2020-05-26,,
2020-05-30,,
2020-06-01,2589.909912,
2020-06-05,,89.0
2020-06-08,,
2020-06-15,,
2020-06-16,,
2020-06-19,,92.0
2020-06-22,,
2020-06-25,,
2020-07-01,2529.820068,
2020-07-03,,
2020-07-06,,
2020-07-07,,90.0
2020-07-12,,
2020-07-14,,
2020-07-20,,92.0
2020-07-26,,
2020-07-27,,
2020-07-31,,
2020-08-01,2532.51001,
2020-08-03,,88.0
2020-08-07,,
2020-08-10,,
2020-08-12,,
2020-08-14,,90.0
2020-08-17,,
2020-08-25,,
2020-08-28,,90.0
2020-08-31,,
2020-09-01,2490.090088,
2020-09-11,2490.090088,
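If a clipboard is not available (for example on a remote machine), the same text can probably be read from a string instead. This is a sketch using only a few rows copied from the data above; the variable name csv_text is an assumption, and the full block would be pasted in practice.

```python
from io import StringIO

import pandas as pd

# A few rows copied from the data above; paste the full block in practice
csv_text = """,Adj Close,Singapore
2020-02-01,3011.080078,
2020-02-21,,24.0
2020-02-25,,
2020-05-01,2510.75,90.0
"""

# Same arguments as the read_clipboard call, applied to an in-memory buffer
df = pd.read_csv(StringIO(csv_text), index_col=[0], parse_dates=[0])

print(df.shape)                       # (4, 2)
print(df["Adj Close"].dropna().size)  # 2
print(df["Singapore"].dropna().size)  # 2
```

The row for 2020-02-25 has NaN in both columns, which illustrates why each column is passed through .dropna separately before plotting.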
Answered By - Trenton McKinney