Issue
I have the following initial dataframe:
| Post ID | Submission_Date | Flair |
---|---|---|---|
0 | row1 | 01.12.2020 | NaN |
1 | row2 | 03.12.2020 | Discussion |
2 | row3 | 03.12.2020 | News |
3 | row4 | 03.12.2020 | Discussion |
4 | row5 | 06.12.2020 | Due Diligence |
5 | row6 | 07.12.2020 | Discussion |
6 | row7 | 31.12.2020 | Discussion |
7 | row8 | 01.01.2021 | Hedge Fund Tears |
- Multiple Dates with missing dates in between
- Multiple categories on dates
I grouped the dataframe with:
import pandas as pd
import numpy as np # for test data
data = {'Post ID': ['row1', 'row2', 'row3', 'row4', 'row5', 'row6', 'row7', 'row8'], 'Submission_Date': ['01.12.2020', '03.12.2020', '03.12.2020', '03.12.2020', '06.12.2020', '07.12.2020', '31.12.2020', '01.01.2021'], 'Flair': [np.nan, 'Discussion', 'News', 'Discussion', 'Due Diligence', 'Discussion', 'Discussion', 'Hedge Fund Tears']}
df = pd.DataFrame(data)
# The test dates are day.month.year strings, so the format has to match them
df['Submission_Date'] = pd.to_datetime(df['Submission_Date'], format="%d.%m.%Y").dt.strftime('%Y-%m-%d')
df = df.groupby('Submission_Date')['Flair'].value_counts(normalize=True).unstack()
The result is a table of flair ratios per date, containing only the dates that have posts.
I want to fill the missing dates with "empty" bars in the plot.
I have already tried this:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(20,10))
df.plot(kind='bar',ax=ax, stacked=True, width=1)
plt.xlabel('Submission_Date', fontsize=16)
plt.ylabel('Ratio of Flairs used', fontsize=16)
But the dates on the x-axis are incorrect, since the empty days are not displayed.
Solution
Assuming this input as df2 (the output of your groupby operation):
Flair Discussion Due Diligence Hedge Fund Tears News
Submission_Date
01.01.2021 NaN NaN 1.0 NaN
03.12.2020 0.666667 NaN NaN 0.333333
06.12.2020 NaN 1.0 NaN NaN
07.12.2020 1.000000 NaN NaN NaN
31.12.2020 1.000000 NaN NaN NaN
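Note that 01.12.2020 does not appear in this table: its only post has no flair, and value_counts skips NaN by default. The short sketch below, using a made-up four-element series rather than the full data, illustrates that default. As a consequence, a date range built from df2 alone starts on 03.12.2020; if the first day should also show up as an empty bar, the range would have to start from the earliest date in the original column instead.

```python
import numpy as np
import pandas as pd

# Illustrative series only: one missing flair and three real ones
flair = pd.Series([np.nan, 'Discussion', 'News', 'Discussion'])

# Default behaviour: the NaN entry is not counted at all
counted = flair.value_counts()

# With dropna=False the NaN entry gets its own row
counted_with_nan = flair.value_counts(dropna=False)

print(counted)
print(counted_with_nan)
```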
You can reindex from pd.date_range:
df2.index = pd.to_datetime(df2.index, format='%d.%m.%Y')
df2 = df2.reindex(pd.date_range(df2.index.min(), df2.index.max()))
df2.index = df2.index.strftime('%Y-%m-%d')
Flair Discussion Due Diligence Hedge Fund Tears News
2020-12-03 0.666667 NaN NaN 0.333333
2020-12-04 NaN NaN NaN NaN
2020-12-05 NaN NaN NaN NaN
2020-12-06 NaN 1.0 NaN NaN
2020-12-07 1.000000 NaN NaN NaN
...
2020-12-30 NaN NaN NaN NaN
2020-12-31 1.000000 NaN NaN NaN
2021-01-01 NaN NaN 1.0 NaN
[Figure: graphical outcome (small size)]
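For reference, the pieces above can be combined into one script. This is only a sketch under a few assumptions: the dates are parsed with '%d.%m.%Y' before grouping (so the index is already a DatetimeIndex and needs no second conversion), the plotting calls are the ones from the question, and the Agg backend is chosen here only so that the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data = {
    'Post ID': ['row1', 'row2', 'row3', 'row4', 'row5', 'row6', 'row7', 'row8'],
    'Submission_Date': ['01.12.2020', '03.12.2020', '03.12.2020', '03.12.2020',
                        '06.12.2020', '07.12.2020', '31.12.2020', '01.01.2021'],
    'Flair': [np.nan, 'Discussion', 'News', 'Discussion', 'Due Diligence',
              'Discussion', 'Discussion', 'Hedge Fund Tears'],
}
df = pd.DataFrame(data)

# Parse the day.month.year strings into real datetimes
df['Submission_Date'] = pd.to_datetime(df['Submission_Date'], format='%d.%m.%Y')

# Ratio of each flair per date, one column per flair
df2 = df.groupby('Submission_Date')['Flair'].value_counts(normalize=True).unstack()

# One row per calendar day between the first and last date present in df2;
# days without posts become all-NaN rows
df2 = df2.reindex(pd.date_range(df2.index.min(), df2.index.max()))

# Format the index as strings so the tick labels stay short
df2.index = df2.index.strftime('%Y-%m-%d')

fig, ax = plt.subplots(figsize=(20, 10))
df2.plot(kind='bar', ax=ax, stacked=True, width=1)
ax.set_xlabel('Submission_Date', fontsize=16)
ax.set_ylabel('Ratio of Flairs used', fontsize=16)
# fig.savefig(...) or plt.show() (with an interactive backend) would display the result
```

The all-NaN rows are drawn as bars of height zero, which gives the "empty" slots on the x-axis.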
Answered By - mozway