Issue
TLDR: What I'm looking for is a way to plot a list of timestamps as equidistant datapoints, with mpl deciding which labels to show.
If equidistant plotting of timestamped datapoints is only possible by turning the timestamps into strings (and so plotting them as a categorical axis), my question could also be phrased: how can one get mpl to automatically drop labels from an overcrowded categorical axis?
Details:
I have a timeseries with monthly data that I'd like to plot as a bar graph. My issue is that matplotlib.pyplot automatically plots this data on a time axis:
import matplotlib.pyplot as plt
import pandas as pd
fig, ax = plt.subplots(1, 1)
s = pd.Series(range(3,7), pd.date_range('2021', freq='MS', periods=4))
ax.bar(s.index, s.values, 27) # width 27 days
ax.set_ylabel('income [Eur]')
Usually, this is what I want, but with monthly data it looks weird because Feb is significantly shorter than the other months. What I want is for the data to be plotted equidistantly. Is there a way to force this?
Importantly, I want to retain the behaviour that e.g. only the second or third label is plotted once the x-axis becomes too crowded, without having to adjust it manually.
What I've tried or don't want to do:
I could make the gaps between the bars the same - by tweaking the width of the bars. However, I'm plotting revenue data [Eur], which means that an uneven bar width is misleading.
I could turn the timestamps into a string so that the data is plotted categorically:
s = pd.Series(range(3,7), pd.date_range('2021', freq='MS', periods=4))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]
ax.bar(x, s.values, 0.9) # width now as fraction of spacing between datapoints
However, this leads mpl to think each label must be plotted, which gets crowded:
s = pd.Series(range(3,17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]
ax.bar(x, s.values, 0.9) # width now as fraction of spacing between datapoints
Solution
You can space out your categorical ticks with the MaxNLocator.
Given your bigger Series sample with categorical labels:
s = pd.Series(range(3,17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]
fig, ax = plt.subplots()
ax.bar(x, s.values, 0.9)
ax.set_ylabel('income [Eur]')
Apply the MaxNLocator with a specified number of bins (or 'auto'):
from matplotlib.ticker import MaxNLocator
locator = MaxNLocator(nbins=5) # or nbins='auto'
ax.xaxis.set_major_locator(locator)
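One possible refinement, offered as a sketch rather than as part of the original answer: a categorical axis places the bars at integer positions 0, 1, 2, ..., and a tick that falls between two bars has no category to show. Passing integer=True to MaxNLocator should restrict the ticks to integer positions, so every tick inside the plotted range lines up with a bar. The Agg backend and the tick_positions variable are assumptions added only so the snippet runs headless and can be inspected.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator

# Same 14-month sample as above, with string labels for a categorical axis
s = pd.Series(range(3, 17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]

fig, ax = plt.subplots()
ax.bar(x, s.values, 0.9)  # width as fraction of spacing between datapoints
ax.set_ylabel('income [Eur]')

# integer=True keeps ticks on whole-number positions, i.e. on the bars
ax.xaxis.set_major_locator(MaxNLocator(nbins=5, integer=True))

fig.canvas.draw()  # force a draw so the locator is evaluated
tick_positions = ax.get_xticks()  # hypothetical name, used only for inspection
```

With nbins='auto' instead of a fixed number, the number of ticks is chosen from the axis length, which is closer to the "adjust automatically" behaviour asked for in the question.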
Answered By - tdy