Issue
I have a dataframe, df, and I would like to add rows for any missing quarters and fill in specific columns for those rows.
Data
type date stat test
aa Q1 2022 20 1
aa Q2 2022 10 2
aa Q3 2022 30 1
bb Q1 2022 30 1
bb Q2 2022 10 1
Desired
type date stat test
aa Q1 2022 20 1
aa Q2 2022 10 2
aa Q3 2022 30 1
aa Q4 2022 0
bb Q1 2022 30 1
bb Q2 2022 10 1
bb Q3 2022 0
bb Q4 2022 0
Doing
Logic:
The pattern is Q1 2022, Q2 2022, Q3 2022 and Q4 2022.
If there is a 'break' in this pattern, the missing rows should be filled in with a stat
value of 0.
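The 'break' check described above can be sketched as follows. This is only an illustration on the sample data from this post; the names `expected` and `missing` are made up for the example and are not part of the original code.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "type": ["aa", "aa", "aa", "bb", "bb"],
    "date": ["Q1 2022", "Q2 2022", "Q3 2022", "Q1 2022", "Q2 2022"],
    "stat": [20, 10, 30, 30, 10],
    "test": [1, 2, 1, 1, 1],
})

# The full pattern every type is expected to follow
expected = ["Q1 2022", "Q2 2022", "Q3 2022", "Q4 2022"]

# For each type, collect the expected quarters that have no row
missing = {}
for type_name, group in df.groupby("type"):
    seen = set(group["date"])
    missing[type_name] = [q for q in expected if q not in seen]

print(missing)
```

On this data, aa is missing Q4 2022 and bb is missing Q3 2022 and Q4 2022, which matches the rows added in the desired output.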
I believe I can create a dictionary of the expected quarters and then combine it with fillna:
data = { "Q1 2022":0 ,
"Q2 2022":0 ,
"Q3 2022":0 ,
"Q4 2022":0 ,
}
df["type"].fillna("", inplace = True)
df["date"].fillna("", inplace = True) #input dictionary mapping
df["stat"].fillna("0", inplace = True)
Any suggestion is appreciated.
Solution
Create a new dataframe with all combinations of type and date, then merge it with your original dataframe. Finally, fill values according to your rules:
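import pandas as pd
from itertools import product
dates = ['Q1 2022', 'Q2 2022', 'Q3 2022', 'Q4 2022']
# Every (type, date) pair that should exist
df1 = pd.DataFrame(product(df['type'].unique(), dates), columns=['type', 'date'])
# Left merge keeps all pairs; missing rows get stat=0 and an empty test
df1 = df1.merge(df, how='left').fillna({'stat': 0, 'test': ''})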
Output:
>>> df1
type date stat test
0 aa Q1 2022 20.0 1.0
1 aa Q2 2022 10.0 2.0
2 aa Q3 2022 30.0 1.0
3 aa Q4 2022 0.0
4 bb Q1 2022 30.0 1.0
5 bb Q2 2022 10.0 1.0
6 bb Q3 2022 0.0
7 bb Q4 2022 0.0
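Note that the merge turns stat and test into floats, because the unmatched rows hold NaN before filling. If integer columns matter, one possible variation (not the method from the answer above, just a sketch under the same sample data) is to reindex on a full MultiIndex and cast afterwards. The names `full_index` and `out` are chosen for this example only, and test is kept as a nullable integer instead of an empty string.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "type": ["aa", "aa", "aa", "bb", "bb"],
    "date": ["Q1 2022", "Q2 2022", "Q3 2022", "Q1 2022", "Q2 2022"],
    "stat": [20, 10, 30, 30, 10],
    "test": [1, 2, 1, 1, 1],
})

dates = ["Q1 2022", "Q2 2022", "Q3 2022", "Q4 2022"]

# Every (type, date) pair that should exist, in type-then-quarter order
full_index = pd.MultiIndex.from_product(
    [df["type"].unique(), dates], names=["type", "date"]
)

# Reindexing adds the missing pairs as rows of NaN
out = df.set_index(["type", "date"]).reindex(full_index).reset_index()

# stat: missing -> 0, back to plain integers
out["stat"] = out["stat"].fillna(0).astype(int)
# test: keep missing as <NA> using pandas' nullable integer dtype
out["test"] = out["test"].astype("Int64")

print(out)
```

This assumes each (type, date) pair appears at most once in df, since reindex requires a unique index.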
Answered By - Corralien