Monday, June 6, 2022

[FIXED] Different colors in matplotlib bar plot

Tags: bar-chart, matplotlib, nan, pandas, python

Issue

I calculated the percentage of NaN values in each column of a dataframe and then plotted it. I want each variable to have a unique color. The code I used works, but the color of every 9th variable is the same as that of the 1st variable, and the cycle repeats.

The code:

per = df.isna().mean().round(4) * 100
f, ax = plt.subplots(figsize=(25, 12), dpi = 200)
i = 0
for key, value in zip(per.keys(), per.values):
    if (value > 0):
        ax.bar(key, value, label=key)
        ax.text(i, value + 0.5, str(np.round(value, 2)), ha='center')
        i = i + 1
ax.set_xticklabels([]) 
ax.set_xticks([]) 
plt.title('NaN Value percentage in the dataset')
plt.ylim(0,115)
plt.ylabel('Percentage')
plt.xlabel('Columns')
plt.legend(loc='upper left')
plt.show()

I tried the following code, but it used only the first color:

from itertools import cycle, islice

my_colors = list(islice(cycle(['b', 'r', 'g', 'y', 'c', 'm', 
                              'tan', 'grey', 'pink', 'chocolate', 'gold']), None, len(df)))

f, ax = plt.subplots(figsize=(25, 12), dpi = 200)
i = 0
for key, value in zip(per.keys(), per.values):
    if (value > 0):
        ax.bar(key, value, label=key, color = my_colors)
        ax.text(i, value + 0.5, str(np.round(value, 2)), ha='center')
        i = i + 1
ax.set_xticklabels([]) 
ax.set_xticks([]) 
plt.title('NaN Value percentage in the dataset')
plt.ylim(0,115)
plt.ylabel('Percentage')
plt.xlabel('Columns')
plt.legend(loc='upper left')
plt.show()

The result:

Any help is appreciated.

See the data here.

Solution

I think there are two problems with your second snippet. The first is in how the list of colors is built:

my_colors = list(islice(cycle(['b', 'r', 'g', 'y', 'c', 'm', 
                              'tan', 'grey', 'pink', 'chocolate', 'gold']), None, len(df)))

Here len(df) gives you the number of rows, but you actually want one color per entry in per, that is, one per column. So use len(per.keys()), or simply len(per). The second problem is that you need to use your variable i to index into your list of colors:

ax.bar(key, value, label=key, color = my_colors)

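Here you need to use my_colors[i]. Each ax.bar call in the loop draws a single bar, so when it is handed the whole list it only ever uses the first color.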
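Putting both fixes together, a minimal sketch of the corrected loop could look like the following. The random dataframe is only a hypothetical stand-in for the asker's data (which is not reproduced here), and the seed and figure size are arbitrary choices. Note that the palette has 11 entries, so with more than 11 bars the colors will still repeat.

```python
from itertools import cycle, islice

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: 5 rows, 20 columns, some NaNs.
rng = np.random.default_rng(0)
data = rng.uniform(0, 10, size=(5, 20))
data[rng.random(data.shape) < 0.4] = np.nan
df = pd.DataFrame(data, columns=list("abcdefghijklmnopqrst"))

per = df.isna().mean().round(4) * 100

palette = ['b', 'r', 'g', 'y', 'c', 'm',
           'tan', 'grey', 'pink', 'chocolate', 'gold']
# Fix 1: one color per column (len(per)), not per row (len(df)).
my_colors = list(islice(cycle(palette), len(per)))

fig, ax = plt.subplots(figsize=(12, 6))
i = 0  # counts the bars actually drawn
for key, value in per.items():
    if value > 0:
        # Fix 2: pass a single color, indexed by the bar counter.
        ax.bar(key, value, label=key, color=my_colors[i])
        ax.text(i, value + 0.5, str(np.round(value, 2)), ha='center')
        i += 1
ax.set_xticks([])
ax.set_ylim(0, 115)
ax.legend(loc='upper left')
```

Call plt.show() afterwards to display the figure.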

Incidentally, Matplotlib's colormaps are a quick way to get a list of distinct colors from a palette: get_cmap(name, n) resamples a colormap to n discrete colors. Try something like this:

import random
import string

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# build df with some random NaNs
data = np.random.uniform(low=0, high=10, size=(5, 20))
mask = np.random.choice([1, 0], data.shape, p=[.4, .6]).astype(bool)
data[mask] = np.nan

df = pd.DataFrame(data, columns=list(string.ascii_lowercase)[:20])

per = df.isna().mean().round(4) * 100

# resample the colormap to one discrete color per column;
# plt.get_cmap is used because matplotlib.cm.get_cmap was removed in Matplotlib 3.9
length = len(per.keys())
cmap = plt.get_cmap('plasma', length)

# shuffled color indices, so neighboring bars do not get neighboring colors
lst = [*range(length)]
random.shuffle(lst)

f, ax = plt.subplots(figsize=(25, 12), dpi=200)
i = 0  # counts the bars drawn; used as x position and as color index
for key, value in zip(per.keys(), per.values):
    if value > 0:
        ax.bar(key, value, label=key, color=cmap(lst[i])[:3])
        ax.text(i, value + 0.5, str(np.round(value, 2)), ha='center')
        i = i + 1
ax.set_xticklabels([])
ax.set_xticks([])
plt.title('NaN Value percentage in the dataset')
plt.ylim(0, 115)
plt.ylabel('Percentage')
plt.xlabel('Columns')
plt.legend(loc='upper left')
plt.show()

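Output: each bar gets its own color from the plasma colormap, assigned in random order.

For a non-random assignment that follows the colormap order, comment out random.shuffle(lst).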
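As a variation on the colormap approach, the loop can be avoided by filtering per first and passing an array of colors to a single ax.bar call. This is only a sketch under assumptions, not part of the original answer: the random dataframe and seed are illustrative, and the bars are placed at integer positions instead of using the column names as categories.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 5 rows, 20 columns, some NaNs.
rng = np.random.default_rng(0)
data = rng.uniform(0, 10, size=(5, 20))
data[rng.random(data.shape) < 0.4] = np.nan
df = pd.DataFrame(data, columns=list("abcdefghijklmnopqrst"))

per = df.isna().mean().round(4) * 100
per = per[per > 0]  # keep only the columns that contain NaNs

# Sample one evenly spaced color per remaining column from the colormap.
cmap = plt.get_cmap('plasma')
colors = cmap(np.linspace(0, 1, len(per)))  # RGBA array, shape (n_bars, 4)

fig, ax = plt.subplots(figsize=(12, 6))
x = np.arange(len(per))
bars = ax.bar(x, per.values, color=colors)  # one call, one color per bar
for xi, value in zip(x, per.values):
    ax.text(xi, value + 0.5, str(np.round(value, 2)), ha='center')
ax.set_xticks([])
ax.set_ylim(0, 115)
ax.set_ylabel('Percentage')
ax.set_xlabel('Columns')
ax.set_title('NaN value percentage in the dataset')
# One legend entry per bar: pair each rectangle with its column name.
ax.legend(bars.patches, list(per.index), loc='upper left')
```

Because the colors are sampled at evenly spaced points of the colormap, they stay distinct for any number of bars up to the colormap's resolution, which avoids the repetition seen with a fixed palette.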

Answered By - ouroboros1

This answer was collected from Stack Overflow, tested by the PythonFixing community admins, and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
