Issue
I have a project in Jupyter notebooks where I am comparing two dataframes. Both are indexed by year, and both have the same columns representing the proportion of followers of a religion in the population. The two dataframes represent two different populations.
I want to be able to display both sets of data in the same line plot, with the same color used for each religion, but with the lines for one population solid, while the lines for the other population are dashed.
I thought I'd be able to do something like this:
ax1.plot(area1_df, color=[col1,col2,col3,col4])
ax1.plot(area2_df, color=[col1,col2,col3,col4], ls=':',alpha=0.5, linewidth=3.0)
But that doesn't work.
At the moment I have this:
import matplotlib.pyplot as plt
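plt.style.use('seaborn')  # set the style before creating the figure; the style is named 'seaborn-v0_8' from Matplotlib 3.6 on
fig, ax1 = plt.subplots(1,1, sharex = True, sharey=True, figsize=(15,5))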
ax1.plot(area1_df);
ax1.plot(area2_df, ls=':',alpha=0.5, linewidth=3.0);
ax1.legend(area1_df.keys(), loc=2)
ax1.set_ylabel('% of population')
plt.tight_layout()
Maybe there's another way of doing this entirely(?)
Bonus points for any ideas as to how best to create a unified legend, with entries for the columns from both dataframes.
Solution
To give each line a particular color, you could capture the list of lines returned by ax1.plot and iterate through it. Each line can then be given its own color, as well as a label for the legend.
The following code first generates some toy data and then iterates through the lines of both plots. A legend with two columns is created using the assigned labels.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
years = np.arange(1990, 2021, 1)
N = years.size
area1_df = pd.DataFrame({f'group {i}': 10+i+np.random.uniform(-1, 1, N).cumsum() for i in range(1, 5)}, index=years)
area2_df = pd.DataFrame({f'group {i}': 10+i+np.random.uniform(-1, 1, N).cumsum() for i in range(1, 5)}, index=years)
fig, ax1 = plt.subplots(figsize=(15,5))
plot1 = ax1.plot(area1_df)
plot2 = ax1.plot(area2_df, ls=':',alpha=0.5, linewidth=3.0)
# pair up the lines of both plots, column by column, and give each pair one color
for l1, l2, label1, label2, color in zip(plot1, plot2, area1_df.columns, area2_df.columns,
                                         ['crimson', 'limegreen', 'dodgerblue', 'turquoise']):
    l1.set_color(color)
    l1.set_label(label1)
    l2.set_color(color)
    l2.set_label(label2)
ax1.legend(ncol=2, title='area1 / area2')
plt.show()
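As far as I can tell, the attempt in the question fails because ax1.plot applies a single color to every line it creates in one call. A possible variation is to plot one column per call, so that the color and the label can be passed directly. The following is only a minimal sketch under the same toy-data assumptions as above; the helper make_toy_df and the label format are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# toy data (assumed layout: years as index, one column per group)
rng = np.random.default_rng(42)
years = np.arange(1990, 2021)
columns = [f'group {i}' for i in range(1, 5)]

def make_toy_df():
    # hypothetical helper: random walks starting near 10
    steps = rng.uniform(-1, 1, (years.size, len(columns)))
    return pd.DataFrame(10 + steps.cumsum(axis=0), index=years, columns=columns)

area1_df = make_toy_df()
area2_df = make_toy_df()
colors = ['crimson', 'limegreen', 'dodgerblue', 'turquoise']

fig, ax1 = plt.subplots(figsize=(15, 5))
for column, color in zip(columns, colors):
    # one call per column, so each call gets exactly one color
    ax1.plot(area1_df.index, area1_df[column], color=color,
             label=f'{column} (area1)')
    ax1.plot(area2_df.index, area2_df[column], color=color,
             ls=':', alpha=0.5, linewidth=3.0, label=f'{column} (area2)')
ax1.legend(ncol=2)
```

The lines are created in the order area1, area2, area1, ... so each group's solid and dotted entries sit next to each other in the legend.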
Alternatively, you could plot via pandas, which does allow assigning a color for each column:
fig, ax1 = plt.subplots(figsize=(15, 5))
colors = plt.cm.Dark2.colors
area1_df.plot(color=colors, ax=ax1)
area2_df.plot(color=colors, ls=':', alpha=0.5, linewidth=3.0, ax=ax1)
ax1.legend(ncol=2, title='area1 / area2')
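For the unified legend asked about in the question, one option is to build the legend from proxy handles instead of the plotted lines: one legend that maps colors to groups and a second one that maps line styles to populations. This is only a sketch, assuming toy data shaped like the example above; the legend titles, the locations and the names color_handles and style_handles are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

# toy data (assumed layout: years as index, one column per group)
rng = np.random.default_rng(0)
years = np.arange(1990, 2021)
columns = [f'group {i}' for i in range(1, 5)]
area1_df = pd.DataFrame(10 + rng.uniform(-1, 1, (years.size, 4)).cumsum(axis=0),
                        index=years, columns=columns)
area2_df = pd.DataFrame(10 + rng.uniform(-1, 1, (years.size, 4)).cumsum(axis=0),
                        index=years, columns=columns)
colors = ['crimson', 'limegreen', 'dodgerblue', 'turquoise']

fig, ax1 = plt.subplots(figsize=(15, 5))
# suppress the automatic pandas legends; custom ones are built below
area1_df.plot(color=colors, ax=ax1, legend=False)
area2_df.plot(color=colors, ls=':', alpha=0.5, linewidth=3.0, ax=ax1, legend=False)

# proxy handles: empty lines that exist only to describe a color or a line style
color_handles = [Line2D([], [], color=c, label=name)
                 for c, name in zip(colors, columns)]
style_handles = [Line2D([], [], color='black', ls='-', label='area1'),
                 Line2D([], [], color='black', ls=':', label='area2')]

first_legend = ax1.legend(handles=color_handles, title='group', loc='upper left')
ax1.add_artist(first_legend)  # keep it; the next legend() call would replace it
ax1.legend(handles=style_handles, title='population', loc='upper right')
```

This keeps the legend at six entries instead of eight, at the cost of the reader having to combine color and line style mentally.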
Answered By - JohanC