Issue
I would like to compare two histograms by having the Y axis show the percentage of each column from the overall dataset size instead of an absolute value. Is that possible? I am using Pandas and matplotlib. Thanks
Solution
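Passing density=True (normed=True for matplotlib < 2.2.0) returns a histogram normalized as a probability density: with pdf as the returned bar heights and bins as the bin edges, np.sum(pdf * np.diff(bins)) equals 1. The bar heights themselves generally do not sum to 1. If you want the heights to sum to 1, so that each bar shows the fraction of the dataset falling in its bin, you can use Numpy's histogram() and normalize the counts yourself.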
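As a quick numeric check of the two normalizations described above, here is a minimal sketch using only NumPy. The seed and the variable names area, counts and fractions are illustrative choices, not part of the original answer.

```python
import numpy as np

# Illustrative data: 30 standard-normal samples with a fixed seed (an assumption).
rng = np.random.default_rng(0)
x = rng.normal(size=30)

# Density normalization: bar height times bin width sums to 1.
pdf, bins = np.histogram(x, density=True)
area = np.sum(pdf * np.diff(bins))

# Fraction normalization: the bar heights themselves sum to 1.
counts, _ = np.histogram(x)
fractions = counts / counts.sum()

print("area under density histogram:", area)
print("sum of density heights:", pdf.sum())
print("sum of fractions:", fractions.sum())
```

The first and third printed values should be 1 up to floating-point rounding, while the sum of the density heights depends on the bin width.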
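import numpy as np
import matplotlib.pyplot as plt

x = np.random.randn(30)
fig, ax = plt.subplots(1, 2, figsize=(10, 4))
# Left: density normalization (bar areas sum to 1)
ax[0].hist(x, density=True, color='grey')
# Right: counts divided by the total (bar heights sum to 1)
hist, bins = np.histogram(x)
ax[1].bar(bins[:-1], hist.astype(float) / hist.sum(), width=np.diff(bins), align='edge', color='grey')
ax[0].set_title('density=True')
ax[1].set_title('hist = hist / hist.sum()')
plt.show()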
Btw: Strange plotting glitch at the first bin of the left plot.
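Since the question asks for percentages on the Y axis when comparing two datasets, here is a hedged sketch of another way to get there: give every sample a weight of 1/N through the weights argument of hist() and format the axis with matplotlib.ticker.PercentFormatter (available in matplotlib 2.1 and later). The helper name plot_percent_hist, the sample data a and b, and the shared bin edges are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter


def plot_percent_hist(ax, values, bins=10, **kwargs):
    """Hypothetical helper: draw a histogram whose bar heights are fractions of the dataset."""
    values = np.asarray(values, dtype=float)
    # Each sample contributes 1/N, so the bar heights sum to 1 when all values fall inside the bins.
    weights = np.full(values.shape, 1.0 / values.size)
    heights, edges, _ = ax.hist(values, bins=bins, weights=weights, **kwargs)
    # Show fractions as percentages: 1.0 is displayed as 100%.
    ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
    return heights, edges


# Two illustrative datasets of different sizes (assumed data).
rng = np.random.default_rng(0)
a = rng.normal(size=30)
b = rng.normal(loc=0.5, size=300)

# Common bin edges so the two histograms are directly comparable.
shared_bins = np.histogram_bin_edges(np.concatenate([a, b]), bins=10)

fig, ax = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
h_a, _ = plot_percent_hist(ax[0], a, bins=shared_bins, color='grey')
h_b, _ = plot_percent_hist(ax[1], b, bins=shared_bins, color='grey')
ax[0].set_ylabel('share of samples')
# Call plt.show() or fig.savefig(...) to render the figure.
```

Because both panels share the same bin edges and Y axis, datasets of different sizes can be compared directly. The values can also be a Pandas Series, since np.asarray converts it to an array.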
Answered By - Rutger Kassies