Issue
I'm new to Python. I hope you can help me. I have a dataframe with two columns. The first column is called dates and the second column is filled with numbers. The dataframe has 351 rows.
dates numbers
01.03.2019 5
02.03.2019 8
...
20.02.2020 3
21.02.2020 2
I want the whole first column to be on the x axis. I tried to plot it like this:
graph = FinalDataframe.plot(figsize=(12, 8))
graph.legend(loc='upper center', bbox_to_anchor=(0.5, -0.075), ncol=4)
graph.set_xticklabels(FinalDataframe['dates'])
plt.show()
But the x axis shows only the first few values from the column instead of the whole column. Furthermore, they do not line up with the data from the second column.
Any suggestions?
Thank you in advance!
Solution
Your issue is that the x ticks are generated automatically and spaced out to be readable, while set_xticklabels simply assigns your labels, in order, to those few ticks. That is why you only see the first few dates, and why they don’t match the data. The simple fix is to tell matplotlib to use one tick per entry, but that’s going to make your x-axis unreadable:
graph.set_xticks(range(len(FinalDataframe['dates'])))
Now you could space them out manually:
graph.set_xticks(range(0, len(FinalDataframe['dates']), 61))
graph.set_xticklabels(FinalDataframe['dates'][::61])
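As a self-contained sketch of the manual spacing above, assuming a stand-in dataframe `df` in place of FinalDataframe (the random numbers, the `step` name and the Agg backend are illustrative choices, not part of the original code):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Stand-in for FinalDataframe: date strings plus random numbers (assumed data)
dates = pd.date_range("2019-03-01", "2020-02-21", freq="D").strftime("%d.%m.%Y")
df = pd.DataFrame({"dates": dates, "numbers": np.random.randint(0, 10, len(dates))})

step = 61  # one label every 61 rows, as in the snippet above
ax = df.plot(y="numbers", figsize=(12, 8))

# Tick positions are row positions, so positions and labels must use the same step
positions = list(range(0, len(df), step))
labels = df["dates"].iloc[::step].tolist()
ax.set_xticks(positions)
ax.set_xticklabels(labels)
```

The key point is that the tick positions and the labels are sliced with the same step, so each label ends up under the row it belongs to.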
However, the best way to plot dates on the x-axis is still to use pandas’ built-in date objects. We can do this with pd.to_datetime.
This will also allow pandas to know where to place points on the x-axis, once you specify that the x-axis should be the dates. That way, if dates are unsorted or some are missing, the gaps are rendered properly and each point sits above the right date.
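To illustrate the point about gaps, here is a small sketch with made-up data (the four rows below are hypothetical, chosen so that the dates are unsorted and two days are missing). Sorting before plotting is my addition, so that the line connects the points in chronological order:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: unsorted, with 03.03.2019 and 04.03.2019 missing
df = pd.DataFrame({
    "dates": ["05.03.2019", "01.03.2019", "02.03.2019", "06.03.2019"],
    "numbers": [3, 5, 8, 4],
})

# Parse the strings into real dates, then sort chronologically
df["dates"] = pd.to_datetime(df["dates"], format="%d.%m.%Y")
df = df.sort_values("dates")

# Each point is placed at its actual date, so the missing days show up as a gap
ax = df.plot(x="dates", y="numbers", marker="o")
```

With string labels the four points would be evenly spaced; with real dates the distance between 02.03.2019 and 05.03.2019 is three times the distance between consecutive days.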
I’m first recreating a dataframe that looks like what you posted:
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> df = pd.DataFrame({'dates': pd.date_range('20190301', '20200221', freq='D').strftime('%d.%m.%Y'), 'numbers': np.random.randint(0, 10, 358)})
>>> df
dates numbers
0 01.03.2019 2
1 02.03.2019 2
2 03.03.2019 5
3 04.03.2019 4
4 05.03.2019 3
.. ... ...
353 17.02.2020 2
354 18.02.2020 1
355 19.02.2020 2
356 20.02.2020 3
357 21.02.2020 1
(This should be the same as FinalDataframe, or if your dates are the index, then it’s the same as FinalDataframe.reset_index())
Now I’m converting the dates:
>>> df['dates'] = pd.to_datetime(df['dates'], format='%d.%m.%Y')
>>> df
dates numbers
0 2019-03-01 2
1 2019-03-02 2
2 2019-03-03 5
3 2019-03-04 4
4 2019-03-05 3
.. ... ...
353 2020-02-17 2
354 2020-02-18 1
355 2020-02-19 2
356 2020-02-20 3
357 2020-02-21 1
You can check your columns contain dates and not string representations of dates by checking their dtypes:
>>> df.dtypes
dates datetime64[ns]
numbers int64
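If you prefer to check this in code rather than by eye, one option is the helper in pandas.api.types. This is only a sketch; the two-element series `raw` and `converted` are made-up names for illustration:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Two date strings in the same dd.mm.yyyy format as the question
raw = pd.Series(["01.03.2019", "02.03.2019"])
converted = pd.to_datetime(raw, format="%d.%m.%Y")

# False for the string column, True once the values are real datetimes
raw_is_datetime = is_datetime64_any_dtype(raw)
converted_is_datetime = is_datetime64_any_dtype(converted)
print(raw_is_datetime, converted_is_datetime)
```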
Finally plotting:
>>> ax = df.plot(x='dates', y='numbers', figsize=(12, 8))
>>> ax.legend(loc='upper center', bbox_to_anchor=(0.5, -0.075), ncol=4)
<matplotlib.legend.Legend object at 0x7fc8c24fd4f0>
>>> plt.show()
Legends are taken care of automatically. This is what you get:
Answered By - Cimbali