Issue
I am trying to make a normalized histogram in matplotlib, but I want it normalized such that the total area will be 1000. Is there a way to do this?
I know that to get it normalized to 1, you just have to include density=True, stacked=True in the arguments of plt.hist(). An equivalent solution would be to do this and multiply the height of each column by 1000, if that would be more doable than changing what the histogram is normalized to.
Thank you very much in advance!
Solution
The following approach uses np.histogram to calculate the counts for each histogram bin. With 1000 / total_count / bin_width as the normalization factor, the total area will be 1000. By contrast, to make the sum of all bar heights equal 1000, a factor of 1000 / total_count would be needed.
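The difference between the two factors can be checked numerically. The sketch below is not part of the original answer: the sample data, the seed, and the variable names area_heights and sum_heights are made up for illustration. Each bar's area is its height times bin_width, so dividing by bin_width in the factor is what makes the areas, rather than the heights, add up to 1000.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only so the check is repeatable
samples = rng.normal(loc=10, scale=5, size=500)  # made-up data

bins = np.linspace(samples.min(), samples.max(), 11)  # 10 equal-width bins
bin_width = bins[1] - bins[0]
counts, _ = np.histogram(samples, bins=bins)
total_count = counts.sum()

# Factor for a total area of 1000: height * bin_width summed over all bins.
area_heights = counts * (1000 / total_count / bin_width)
# Factor for bar heights that sum to 1000 (the area then depends on bin_width).
sum_heights = counts * (1000 / total_count)

total_area = (area_heights * bin_width).sum()
total_height = sum_heights.sum()
print(total_area, total_height)  # both should be 1000 up to rounding
```

Note that this only works as written for equal-width bins; with uneven bins, each count would have to be divided by its own bin width.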
plt.bar is used to display the end result.
The example code also draws the same combined histogram with density=True, to compare it with the new histogram whose total area is 1000.
import matplotlib.pyplot as plt
import numpy as np

# Three sample datasets that will be stacked in one histogram
data = [np.random.randn(100) * 5 + 10, np.random.randn(300) * 4 + 14, np.random.randn(100) * 3 + 17]
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(12, 4))

# Left: reference histogram normalized to a total area of 1
ax1.hist(data, stacked=True, density=True)
ax1.set_title('Histogram with density=True')

# Right: the same bins (10 equal-width bins over the full data range), normalized manually
xmin = min([min(d) for d in data])
xmax = max([max(d) for d in data])
bins = np.linspace(xmin, xmax, 11)
bin_width = bins[1] - bins[0]
counts = [np.histogram(d, bins=bins)[0] for d in data]
total_count = sum([sum(c) for c in counts])
# factor = 1000 / total_count  # to sum to 1000
factor = 1000 / total_count / bin_width  # for an area of 1000
thousands = [c * factor for c in counts]

# Stack the bars: each dataset is drawn on top of the previous ones
bottom = 0
for t in thousands:
    ax2.bar(bins[:-1], t, bottom=bottom, width=bin_width, align='edge')
    bottom += t
ax2.set_title('Histogram with total area of 1000')
plt.show()
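As a possible alternative that is not part of the original answer, the same scaling can be passed to ax.hist through its weights parameter, so that matplotlib does the stacking itself. This is a minimal sketch assuming equal-width bins; the seeded data and the names weights, heights and edges are illustrative. With stacked=True the returned heights are cumulative, so the last row holds the top of the stack.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded made-up data, same shapes as above
data = [rng.normal(10, 5, 100), rng.normal(14, 4, 300), rng.normal(17, 3, 100)]

bins = np.linspace(min(d.min() for d in data), max(d.max() for d in data), 11)
bin_width = bins[1] - bins[0]
total_count = sum(len(d) for d in data)

# Every sample contributes the same weight, chosen so the total area is 1000.
factor = 1000 / total_count / bin_width
weights = [np.full(len(d), factor) for d in data]

fig, ax = plt.subplots()
heights, edges, _ = ax.hist(data, bins=bins, weights=weights, stacked=True)
ax.set_title('Stacked histogram with total area of 1000')

# heights[-1] is the top of the stack; its area should be 1000 up to rounding.
total_area = (heights[-1] * np.diff(edges)).sum()
print(total_area)
```

Call plt.show() or fig.savefig(...) afterwards to view the figure. The weights approach avoids the manual bottom bookkeeping, at the cost of building one weight per sample.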
Answered By - JohanC