Issue
I'd like to use Matplotlib to plot a histogram over data that's been pre-counted. For example, say I have the raw data
data = [1, 2, 2, 3, 4, 5, 5, 5, 5, 6, 10]
Given this data, I can use
pylab.hist(data, bins=[...])
to plot a histogram.
In my case, the data has been pre-counted and is represented as a dictionary:
counted_data = {1: 1, 2: 2, 3: 1, 4: 1, 5: 4, 6: 1, 10: 1}
Ideally, I'd like to pass this pre-counted data to a histogram function that lets me control the bin widths, plot range, etc., as if I had passed it the raw data. As a workaround, I'm expanding my counts into the raw data:
from itertools import chain, repeat
data = list(chain.from_iterable(repeat(value, count)
                                for (value, count) in counted_data.items()))
This is inefficient when counted_data contains counts for millions of data points.
Is there an easier way to use Matplotlib to produce a histogram from my pre-counted data?
Alternatively, if it's easiest to just bar-plot data that's been pre-binned, is there a convenience method to "roll-up" my per-item counts into binned counts?
Solution
I used pyplot.hist's weights option to weight each key by its value, producing the histogram that I wanted:
pylab.hist(list(counted_data.keys()), weights=list(counted_data.values()), bins=range(50))
This allows me to rely on hist to re-bin my data.
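For the "roll-up" part of the question, the same weights idea should also work with numpy.histogram, which returns the binned counts without drawing anything. Below is a minimal sketch, assuming NumPy is available; the bin edges and the plot_binned helper are made up for illustration and are not part of the original answer. It also checks that the weighted result matches what you get from expanding the counts into raw data.

```python
import numpy as np

counted_data = {1: 1, 2: 2, 3: 1, 4: 1, 5: 4, 6: 1, 10: 1}
values = list(counted_data.keys())
counts = list(counted_data.values())

# Hypothetical bin edges chosen for this example: 0, 2, 4, 6, 8, 10.
bins = np.arange(0, 12, 2)

# Roll the per-item counts up into per-bin counts without expanding them.
binned, edges = np.histogram(values, bins=bins, weights=counts)

# Sanity check: histogramming the expanded raw data gives the same counts.
raw = np.repeat(values, counts)
expected, _ = np.histogram(raw, bins=bins)
assert np.array_equal(binned, expected)


def plot_binned(binned, edges):
    """Bar-plot pre-binned counts (illustrative helper, not a Matplotlib API)."""
    import matplotlib.pyplot as plt

    # align="edge" puts each bar's left side on its bin's left edge.
    plt.bar(edges[:-1], binned, width=np.diff(edges), align="edge")
    plt.show()
```

Calling plot_binned(binned, edges) would then draw the pre-binned counts as bars. The expansion via np.repeat is only there to verify the result on this small example; for millions of data points you would skip it and use the weighted call alone.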
Answered By - Josh Rosen