Issue
Taking a tip from another thread (@EnricoGiampieri's answer to cumulative distribution plots python), I wrote:
# plot cumulative density function of nearest nbr distances
# evaluate the histogram
values, base = np.histogram(nearest, bins=20, density=1)
#evaluate the cumulative
cumulative = np.cumsum(values)
# plot the cumulative function
plt.plot(base[:-1], cumulative, label='data')
I put in the density=1 from the documentation on np.histogram, which says:
Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.
Well, indeed, when plotted, they don't sum to 1. But I do not understand the "bins of unity width". When I set bins to 1, of course, I get an empty chart; when I set it to the population size, I don't get a sum of 1 (more like 0.2). When I use the 40 bins suggested, they sum to about 0.006.
Can anybody give me some guidance? Thanks!
Solution
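With density=1, np.histogram normalizes the values so that their integral over the range (each value times its bin width, summed) is 1. The values themselves therefore sum to 1 only when every bin has width 1, so you need to make sure your bins are all width 1. That is: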
np.all(np.diff(base)==1)
To achieve this, you have to manually specify your bins:
# unit-width bin edges; the +1 makes the last edge reach nearest.max()
bins = np.arange(np.floor(nearest.min()), np.ceil(nearest.max()) + 1)
values, base = np.histogram(nearest, bins=bins, density=1)
And you get:
In [18]: np.all(np.diff(base)==1)
Out[18]: True
In [19]: np.sum(values)
Out[19]: 0.99999999999999989
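If you would rather keep an arbitrary number of bins, one alternative (not part of the original answer) is to weight each density value by its bin width before taking the cumulative sum. The sketch below assumes made-up sample data in place of `nearest`; the `rng` and `widths` names are only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the `nearest` array from the question.
rng = np.random.default_rng(0)
nearest = rng.exponential(scale=3.0, size=500)

# Same call as in the question, with 20 equal-width bins.
values, base = np.histogram(nearest, bins=20, density=True)

# Width of each bin, taken from consecutive bin edges.
widths = np.diff(base)

# values * widths is the probability mass in each bin,
# so the running sum ends at 1 regardless of the bin width.
cumulative = np.cumsum(values * widths)

print(np.isclose(cumulative[-1], 1.0))
```

When plotting, using base[1:] as the x values places each cumulative value at the right edge of its bin, which is where that much probability has been accumulated.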
Answered By - perimosocordiae