Taking a tip from another thread (@EnricoGiampieri's answer to cumulative distribution plots python), I wrote:
# plot cumulative density function of nearest nbr distances
# evaluate the histogram
values, base = np.histogram(nearest, bins=20, density=1)
#evaluate the cumulative
cumulative = np.cumsum(values)
# plot the cumulative function
plt.plot(base[:-1], cumulative, label='data')
I put in density=1 based on the documentation for np.histogram, which says:
Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function.
Well, indeed, when plotted, they don't sum to 1. But, I do not understand the "bins of unity width." When I set the bins to 1, of course, I get an empty chart; when I set them to the population size, I don't get a sum to 1 (more like 0.2). When I use the 40 bins suggested, they sum to about .006.
Can anybody give me some guidance? Thanks!
You need to make sure your bins are all width 1. That is, np.all(np.diff(base) == 1) should be True.
To achieve this, you have to manually specify your bins:
# np.arange excludes its stop value, so add 1 to keep the last edge at or above the maximum
bins = np.arange(np.floor(nearest.min()), np.ceil(nearest.max()) + 1)
values, base = np.histogram(nearest, bins=bins, density=1)
And you get:
In [18]: np.all(np.diff(base)==1)
Out[18]: True
In [19]: np.sum(values)
Out[19]: 0.99999999999999989
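If unit-width bins are too coarse or too fine for the data, another option is to keep any bin count and weight each density value by its bin width before accumulating, since density=1 normalizes the integral of the histogram rather than the sum of its values. Here is a minimal sketch of that idea; the `nearest` array below is made-up stand-in data, not the data from the question:

```python
import numpy as np

# Hypothetical stand-in for the `nearest` array from the question
rng = np.random.default_rng(0)
nearest = rng.exponential(scale=5.0, size=1000)

# Any bin count works; the bins do not need to have width 1
values, base = np.histogram(nearest, bins=20, density=True)

# With density=True, sum(values * bin_widths) is 1, not sum(values)
widths = np.diff(base)
cumulative = np.cumsum(values * widths)

print(values.sum())    # generally not 1
print(cumulative[-1])  # 1 up to floating-point rounding
```

When plotting, using base[1:] instead of base[:-1] as the x values puts each cumulative value at the right edge of its bin, which is the point up to which that fraction of the data has been counted.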
Answered By - perimosocordiae