Issue
I'm relatively new to Matplotlib and still struggling with certain aspects.
I'm trying to apply a custom colour gradient to my bar plot, so that each bar's colour reflects where its value falls between the minimum and the maximum.
I've been able to create a LinearSegmentedColormap and, using the zip function, a sorted array. However, when I try to apply it via color= or color=my_cmap(colors_array), it gives me a ValueError.
I was wondering whether there is something wrong with my approach, or whether there is another way to solve this problem.
Code:
import math
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
np.random.seed(12345)
df = pd.DataFrame([np.random.normal(32000, 200000, 3650),
np.random.normal(43000, 100000, 3650),
np.random.normal(43500, 140000, 3650),
np.random.normal(48000, 70000, 3650)],
index=[1992, 1993, 1994, 1995])
plt.style.use('ggplot')
fig, ax = plt.subplots()
cmap = mpl.colors.LinearSegmentedColormap.from_list('blue_to_red', ['darkblue', 'darkred'])
df_mean = [df.iloc[index].mean() for index in range(0, len(df.index))]
colors = [color for color in cmap(np.linspace(0, 1, len(df.index)))]
colors = [colors for _ in zip(df_mean, colors)]
ax.bar(
df.index.tolist(), # X-Axis, would be 1992 to 1995
[df.iloc[index].mean() for index in range(0, len(df.index))], # List of mean values from 92-95
yerr=[(df.iloc[i].std() / math.sqrt(len(df.iloc[i]))) for i in range(len(df))], # Standard deviation, 92-95
color=cmap(colors)
)
fig.colorbar(cmap)
ax.set_title('Voting Results 1992 - 1995', fontsize=12)
plt.xticks(df.index, ('1992', '1993', '1994', '1995'))
ax.set_xlabel('Years')
ax.set_ylabel('Mean')
plt.show()
Thank you in advance!
Solution
First off, having a dataframe with the years as the index and all the values spread across columns is very unusual. (Pandas can work with that format, although it is a bit slower.)
If I understand correctly, you want to assign colors depending on where the mean of each year lies relative to the means of the other years. You'll need to create a norm (a mapping between the values and the range 0-1 needed by the colormap). Then, you can use the colormap and that norm to map each of the means to a color.
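To make the norm step concrete, here is a minimal sketch of what the mapping does on its own, without any plotting. The `means` values are made up for illustration and are not the means of the data in the question.

```python
import numpy as np
from matplotlib.colors import LinearSegmentedColormap, Normalize

# Hypothetical yearly means, chosen only for illustration
means = np.array([32000.0, 43000.0, 43500.0, 48000.0])

cmap = LinearSegmentedColormap.from_list('blue_to_red', ['darkblue', 'darkred'])
norm = Normalize(vmin=means.min(), vmax=means.max())

scaled = norm(means)   # values rescaled linearly into the range 0-1
colors = cmap(scaled)  # one RGBA row per mean, usable as color= in ax.bar
```

The smallest mean is scaled to 0 and the largest to 1, so they get the two end colors of the colormap. A colormap expects scalars in the range 0-1 (or integer indices), not colors, so feeding it a list of already computed RGBA values is not what it is designed for.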
The years are converted to strings, so matplotlib will make a categorical x axis.
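As a small sketch of the categorical axis behaviour, assuming a non-interactive backend so it runs without a display (the bar heights are placeholder numbers):

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, only so this sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

years = pd.Index([1992, 1993, 1994, 1995])
labels = years.astype(str)  # '1992', '1993', ... instead of integers

fig, ax = plt.subplots()
bars = ax.bar(labels, [1, 2, 3, 4])  # placeholder heights
fig.canvas.draw()  # forces the tick labels to be computed

tick_labels = [t.get_text() for t in ax.get_xticklabels()]
bar_centers = [b.get_x() + b.get_width() / 2 for b in bars]
```

With string labels, the bars are placed at positions 0, 1, 2, 3 and the tick labels are the strings themselves, so a separate plt.xticks call should not be needed.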
Here is some sample code to get you started:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
from matplotlib.cm import ScalarMappable
np.random.seed(12345)
df = pd.DataFrame([np.random.normal(32000, 200000, 3650),
np.random.normal(43000, 100000, 3650),
np.random.normal(43500, 140000, 3650),
np.random.normal(48000, 70000, 3650)],
index=[1992, 1993, 1994, 1995])
plt.style.use('ggplot')
fig, ax = plt.subplots()
cmap = mpl.colors.LinearSegmentedColormap.from_list('blue_to_red', ['darkblue', 'darkred'])
df_mean = df.mean(axis=1)
norm = plt.Normalize(df_mean.min(), df_mean.max())
colors = cmap(norm(df_mean))
ax.bar(df.index.astype(str),
df_mean,
yerr=df.std(axis=1) / np.sqrt(len(df.columns)),
color=colors)
fig.colorbar(ScalarMappable(cmap=cmap, norm=norm), ax=ax)  # recent matplotlib versions need ax= for a standalone mappable
ax.set_title('Voting Results 1992 - 1995', fontsize=12)
ax.set_xlabel('Years')
ax.set_ylabel('Mean')
plt.tight_layout()
plt.show()
PS: The more standard way to represent the dataframe would be to put each year into a column. Then the mean for each column can be calculated as df.mean().
df = pd.DataFrame(np.array([np.random.normal(32000, 200000, 3650),
np.random.normal(43000, 100000, 3650),
np.random.normal(43500, 140000, 3650),
np.random.normal(48000, 70000, 3650)]).T,
columns=[1992, 1993, 1994, 1995])
df_mean = df.mean()
ax.bar(df.columns.astype(str),
df_mean,
yerr=df.std() / np.sqrt(len(df)),
color=colors)
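As a quick check that both layouts give the same bar heights and error bars, here is a small sketch with made-up data (the seed, sample size and distribution parameters are arbitrary and are not those of the question):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed for this sketch
data = rng.normal(40000, 100000, size=(4, 100))  # 4 years x 100 made-up samples
years = [1992, 1993, 1994, 1995]

rows_df = pd.DataFrame(data, index=years)      # layout from the question: one row per year
cols_df = pd.DataFrame(data.T, columns=years)  # more standard layout: one column per year

mean_rows = rows_df.mean(axis=1)
mean_cols = cols_df.mean()

# Standard error of the mean: standard deviation divided by sqrt(sample count)
sem_rows = rows_df.std(axis=1) / np.sqrt(len(rows_df.columns))
sem_cols = cols_df.std() / np.sqrt(len(cols_df))
```

Both versions produce the same four means and the same four standard errors, so the colors computed from the means can be reused unchanged. The column layout just lets you drop the axis=1 arguments.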
Answered By - JohanC