Issue
I'm interested in plotting the probability distribution of a set of points that follow a power law. I would also like to use logarithmic binning to smooth out the large fluctuations in the tail. If I just use logarithmic binning and plot it on a log-log scale, such as
pl.hist(MyList,log=True, bins=pl.logspace(0,3,50))
pl.xscale('log')
for example, then the problem is that the larger bins account for more points, i.e. the heights of my bins are not scaled by bin size.
Is there a way to use logarithmic binning and still have Python scale all the heights by the size of the bin? I know I can probably do this manually in some roundabout fashion, but it seems like a feature that should exist, and I can't find it. If you think histograms are fundamentally a bad way to represent my data and you have a better idea, I'd love to hear that too.
Thanks!
Solution
Matplotlib's hist won't help you much if you have special requirements for your histograms. You can, however, easily create and manipulate a histogram with numpy and then plot the result yourself.
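import numpy as np
from matplotlib import pyplot as plt
# something random to plot
data = (np.random.random(10000)*10)**3
# log-scaled bin edges from 10**0 to 10**3 (50 edges, 49 bins)
bins = np.logspace(0, 3, 50)
widths = np.diff(bins)
# calculate the histogram (raw counts per bin)
counts, _ = np.histogram(data, bins=bins)
# normalize by bin width
hist_norm = counts / widths
# plot it! align='edge' places each bar's left side on its left bin edge
# (the default, align='center', would center the bars on the edges)
plt.bar(bins[:-1], hist_norm, widths, align='edge')
plt.xscale('log')
plt.yscale('log')
plt.show()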
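To see why dividing by the bin width is the right correction, here is a short derivation. The symbols are assumptions introduced for illustration and do not appear in the question: a density p(x) = C x^{-\alpha} with \alpha \neq 1, N data points, and log-spaced bin edges b_i = b_0 r^i, so that the bin widths \Delta_i = b_i (r - 1) grow in proportion to b_i.

```latex
\begin{align}
  \mathbb{E}[n_i]
    &= N \int_{b_i}^{b_{i+1}} C x^{-\alpha}\,dx
     = N C\,\frac{r^{1-\alpha} - 1}{1 - \alpha}\; b_i^{\,1-\alpha}, \\
  \frac{\mathbb{E}[n_i]}{\Delta_i}
    &= N C\,\frac{r^{1-\alpha} - 1}{(1 - \alpha)(r - 1)}\; b_i^{-\alpha}.
\end{align}
```

Under these assumptions the raw counts fall off with slope 1 - \alpha on a log-log plot, one unit shallower than the underlying density, while the width-normalized heights recover the slope -\alpha.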
When you present your data in a non-obvious way like this, you have to be very careful to label your y axis properly (here it shows counts per unit bin width, not raw counts) and to write an informative figure caption.
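If what you want is a probability density rather than counts per unit width, one possible variant is to let numpy do the normalization with the density=True argument of np.histogram, which divides each count by the bin width and by the number of points that fall inside the bin range. This is a minimal sketch, not part of the original answer: the helper names log_binned_density and plot_density are hypothetical, the axis labels are only suggestions, and the random data is seeded so that the run is repeatable.

```python
import numpy as np


def log_binned_density(data, bins):
    """Return (density, widths) for the given bin edges.

    density[i] = count[i] / (n_in_range * widths[i]), so
    sum(density * widths) is 1 over the binned range. Points that
    fall outside the bin edges are ignored by np.histogram.
    """
    density, edges = np.histogram(data, bins=bins, density=True)
    return density, np.diff(edges)


def plot_density(density, bins):
    """Draw the density as bars on log-log axes with labeled axes."""
    # imported here so the numeric part above runs without matplotlib
    from matplotlib import pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(bins[:-1], density, np.diff(bins), align='edge')
    ax.set_xscale('log')
    ax.set_yscale('log')
    ax.set_xlabel('x')
    ax.set_ylabel('probability density p(x)')
    return ax


# same kind of toy data as in the answer, with a fixed seed
rng = np.random.default_rng(0)
data = (rng.random(10000) * 10) ** 3
bins = np.logspace(0, 3, 50)

density, widths = log_binned_density(data, bins)
# plot_density(density, bins)  # uncomment to draw the figure
```

Note that with this toy data roughly a tenth of the points are smaller than 1 and therefore lie below the first bin edge, so the density is normalized over the remaining points only. That is worth stating in the caption.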
Answered By - tjollans