Issue
I have a stacked bar chart showing the count of different comorbidities grouped by sex.
Female is represented by 'pink', Male by 'lightskyblue'.
To make it more obvious that 'None' is not just another comorbidity, I want 'None' (patient had no comorbidity) to use a different set of colors to make it stand out: 'lightcoral' for Female and 'royalblue' for Male.
This is my current stacked bar chart using colors = ['lightskyblue', 'pink']:
comorbidity_counts_by_gender = df.groupby('sex_female')[comorbidity_columns].sum()
comorbidity_counts_by_gender = comorbidity_counts_by_gender[comorbidity_counts_by_gender.sum().sort_values(ascending=True).index].T
colors = ['lightskyblue', 'pink']
bars = comorbidity_counts_by_gender.plot(kind='barh', stacked=True, figsize=(8, 8), color=colors, width=0.8)
plt.title('Distribution of Comorbidities by Gender')
plt.xlabel('Count')
plt.ylabel('')
bars.legend(title='Gender', loc="upper left", bbox_to_anchor=(1, 1), frameon=False)
plt.show()
No matter what I try, I can't seem to provide a different pair of colors to the 'None' bar. Here are two ways I tried to solve the issue that didn't work out for me:
Try 1:
colors = [['royalblue', 'lightcoral'] if comorbidity == 'None' else ['lightskyblue', 'pink'] for comorbidity in comorbidity_counts_by_gender.index]
This results in ValueError: Invalid color ['lightskyblue', 'pink']
Try 2:
colors = []
for comorbidity in comorbidity_counts_by_gender.index:
    if comorbidity == 'None':
        colors = ['royalblue', 'lightcoral']
    else:
        colors = ['lightskyblue', 'pink']
This always uses ['lightskyblue', 'pink'] for any column.
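A likely reason both attempts fail: Try 1 builds a list of pairs, so each element handed over as a colour is itself a list, which is what the ValueError reports. Try 2 rebinds `colors` on every pass of the loop, so only the pair chosen for the last label survives. The sketch below reproduces the second behaviour in plain Python and then builds one colour per row for each sex. The `labels` list is made-up data, and `male_colors` / `female_colors` are hypothetical names, not part of the original code.

```python
# Hypothetical row labels; 'None' is deliberately not the last one.
labels = ["Smoking", "Hypertension", "None", "Hyperthyroidism"]

# Try 2 as written: `colors` is reassigned on each iteration,
# so after the loop it holds only the pair chosen for the last label.
colors = []
for comorbidity in labels:
    if comorbidity == "None":
        colors = ["royalblue", "lightcoral"]
    else:
        colors = ["lightskyblue", "pink"]
print(colors)  # ['lightskyblue', 'pink']

# What is actually wanted: one colour per row, kept separately for each sex.
male_colors = ["royalblue" if label == "None" else "lightskyblue" for label in labels]
female_colors = ["lightcoral" if label == "None" else "pink" for label in labels]
print(male_colors)
print(female_colors)
```

Per-row lists like these match the shape of the problem (one colour per bar, per sex), whereas a single two-element list can only describe one colour per sex for the whole chart.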
Solution
Here's an example that changes the colours of the appropriate bars directly:
import pandas as pd
from matplotlib import pyplot as plt


def setcolors(ax, name="None", colors=["royalblue", "lightcoral"]):
    """
    Function to set the colours for the bars for a given category name.
    """
    # get labels
    ytl = ax.get_yticklabels()
    numlabels = len(ytl)

    # find the index of the given named label
    for i, t in enumerate(ytl):
        if t.get_text() == name:
            break

    # get the matplotlib rectangle objects representing the bars
    # (note this relies on nothing else having been added to the plot)
    rects = ax.get_children()[0:2 * numlabels]
    nrects = [rects[i], rects[numlabels + i]]

    # loop over the two bars for the given named label and change the colours
    for rect, color in zip(nrects, colors):
        rect.set_color(color)
        rect.set_edgecolor("none")


# some mock data
df = pd.DataFrame(
    {
        "Male": [5, 1, 3, 1],
        "Female": [4, 2, 2, 0]
    },
    index=["Smoking", "Hypertension", "None", "Hyperthyroidism"],
)

bars = df.plot(kind="barh", stacked=True, color=["lightskyblue", "pink"])

# change the colors
setcolors(bars)

plt.show()
Note that, by default (I think), the Rectangle objects representing the bars should be the first things in the list returned by get_children(). But if you add further things to the plot, this may not be the case.
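One way to avoid depending on the order of get_children() is to go through ax.containers instead. The sketch below is not part of the original answer. It assumes that pandas draws one BarContainer per DataFrame column, in column order, with the patches inside each container in row order. The helper name `recolor_row` and its `labels` parameter are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop for interactive use
import pandas as pd
from matplotlib import pyplot as plt


def recolor_row(ax, labels, name="None", colors=("royalblue", "lightcoral")):
    """Recolour the stacked segments belonging to the row called `name`.

    Assumes ax.containers holds one BarContainer per column (in column order)
    and that each container's patches follow the order of `labels`.
    """
    row = list(labels).index(name)  # raises ValueError if `name` is absent
    for container, color in zip(ax.containers, colors):
        patch = container.patches[row]
        patch.set_facecolor(color)
        patch.set_edgecolor("none")


# the same mock data as above
df = pd.DataFrame(
    {"Male": [5, 1, 3, 1], "Female": [4, 2, 2, 0]},
    index=["Smoking", "Hypertension", "None", "Hyperthyroidism"],
)

ax = df.plot(kind="barh", stacked=True, color=["lightskyblue", "pink"])
recolor_row(ax, df.index)
```

Because the lookup uses the DataFrame index rather than the tick labels, a misspelled name raises an error instead of silently recolouring the last bar.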
Answered By - Matt Pitkin