Tuesday, February 1, 2022

[FIXED] Plot datetime series as categorical data in matplotlib

Labels: matplotlib, plot, python

Issue

TLDR: What I'm looking for is a way to plot a list of timestamps as equidistant datapoints, with mpl deciding which labels to show.

If equidistant plotting of timestamped datapoints is only possible by turning the timestamps into strings (and so plotting them as a categorical axis), my question could also be phrased: how can one get mpl to automatically drop labels from an overcrowded categorical axis?

Details:

I have a time series with monthly data that I'd like to plot as a bar graph. My issue is that matplotlib.pyplot automatically plots this data on a time axis:

import matplotlib.pyplot as plt
import pandas as pd
fig, ax = plt.subplots(1, 1)
s = pd.Series(range(3,7), pd.date_range('2021', freq='MS', periods=4))
ax.bar(s.index, s.values, 27) # width 27 days
ax.set_ylabel('income [Eur]')

Usually, this is what I want, but with monthly data it looks weird because February is significantly shorter than the other months. What I want is for the data to be plotted equidistantly. Is there a way to force this?
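To make the unevenness concrete, here is a small sketch that computes the distance between consecutive bar centres and the gap left over by a 27-day bar. It assumes the same index as the snippet above; the names `idx`, `spacing` and `gaps` are only illustrative.

```python
import pandas as pd

# Same index as in the snippet above: four month starts in 2021.
idx = pd.date_range('2021', freq='MS', periods=4)

# Days between consecutive bar centres (Jan->Feb, Feb->Mar, Mar->Apr).
spacing = [int(d) for d in (idx[1:] - idx[:-1]).days]

# With a fixed bar width of 27 days, the gap is whatever is left over.
gaps = [d - 27 for d in spacing]

print(spacing)  # [31, 28, 31]
print(gaps)     # [4, 1, 4]
```

The gap after the January bar is four times as wide as the gap after the February bar, which is what makes the plot look irregular.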

Importantly, I want to retain the behaviour that e.g. only the second or third label is plotted once the x-axis becomes too crowded, without having to adjust it manually.

What I've tried or don't want to do:

I could make the gaps between the bars the same by tweaking the width of each bar. However, I'm plotting revenue data [Eur], which means that an uneven bar width is misleading.
I could turn the timestamps into strings so that the data is plotted categorically:

s = pd.Series(range(3,7), pd.date_range('2021', freq='MS', periods=4))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]
ax.bar(x, s.values, 0.9) # width now as fraction of spacing between datapoints
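The reason the width becomes a fraction of the spacing is that string categories are placed at consecutive integer positions on the axis. A minimal sketch to check this, assuming hard-coded labels and a non-interactive backend (both are choices made for this illustration, not part of the original snippet):

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

labels = ['2021-01', '2021-02', '2021-03', '2021-04']
fig, ax = plt.subplots()
bars = ax.bar(labels, [3, 4, 5, 6], 0.9)

# Centre of each bar in data coordinates: left edge plus half the width.
centres = [float(round(p.get_x() + p.get_width() / 2, 6)) for p in bars]
print(centres)
```

The centres come out as 0, 1, 2 and 3, so the bars are equidistant regardless of how long each month is.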

However, this leads mpl to think each label must be plotted, which gets crowded:

s = pd.Series(range(3,17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]
ax.bar(x, s.values, 0.9) # width now as fraction of spacing between datapoints

Solution

You can space out your categorical ticks with the MaxNLocator.

Given your bigger Series sample with categorical labels:

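import matplotlib.pyplot as plt
import pandas as pd

s = pd.Series(range(3,17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index] # 'YYYY-MM' strings -> categorical axis

fig, ax = plt.subplots()
ax.bar(x, s.values, 0.9) # width as fraction of spacing between datapoints
ax.set_ylabel('income [Eur]')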

Apply the MaxNLocator with a specified number of bins (or 'auto'):

from matplotlib.ticker import MaxNLocator

locator = MaxNLocator(nbins=5) # or nbins='auto'
ax.xaxis.set_major_locator(locator)
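One possible refinement, offered as a sketch rather than as part of the original answer: MaxNLocator can pick non-integer steps such as 2.5, and as far as I can tell the categorical formatter labels a tick with the nearest category, so a fractional tick could carry a label that does not sit on a bar. Passing `integer=True` keeps every tick on a whole-number position. The `Agg` backend and the `visible` helper list are assumptions made only so the example runs without a display and can be inspected.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import MaxNLocator

s = pd.Series(range(3, 17), pd.date_range('2021', freq='MS', periods=14))
x = [f'{ts.year}-{ts.month:02}' for ts in s.index]

fig, ax = plt.subplots()
ax.bar(x, s.values, 0.9)
ax.set_ylabel('income [Eur]')

# integer=True restricts ticks to whole numbers, i.e. to actual category positions.
ax.xaxis.set_major_locator(MaxNLocator(nbins=5, integer=True))

# Draw once so the locator runs, then collect the ticks inside the view limits.
fig.canvas.draw()
lo, hi = ax.get_xlim()
visible = [float(t) for t in ax.get_xticks() if lo <= t <= hi]
print(visible)
```

Every entry in `visible` is a whole number, so each remaining label lines up with the bar it names.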

Answered By - tdy

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
