Issue
I have a dataframe, df, where I would like output to reveal what dates are missing as well as what ids are missing. I am verifying against ids of aa,bb,cc and dd.
Data
id date pwr
aa Q1.22 10
aa Q1.22 1
aa Q2.22 1
aa Q2.22 5
bb Q1.22 5
bb Q1.22 1
bb Q2.22 1
bb Q2.22 1
cc Q1.22 2
cc Q2.22 2
Desired
id date
aa Q3.22
aa Q4.22
bb Q3.22
bb Q4.22
cc Q3.22
cc Q4.22
dd Q1.22
dd Q2.22
dd Q3.22
dd Q4.22
Doing
pd.date_range(start = '2022-01-01', end = '2023-01-01' ).difference(df.index)
I believe I can do something like this. I unsure how to implement with quarters. I am still researching. Any suggestion is appreciated.
Solution
Try with itertools
then merge
import itertools
alldf = pd.DataFrame(list(itertools.product(['aa','bb','cc','dd'],['Q1.22','Q2.22','Q3.22','Q4.22'])),columns=['id','date'])
out = df.merge(alldf,indicator=True,how='right')
missing = out[out['_merge'] =='right_only']
missing
Out[159]:
id date pwr _merge
4 aa Q3.22 NaN right_only
5 aa Q4.22 NaN right_only
10 bb Q3.22 NaN right_only
11 bb Q4.22 NaN right_only
14 cc Q3.22 NaN right_only
15 cc Q4.22 NaN right_only
16 dd Q1.22 NaN right_only
17 dd Q2.22 NaN right_only
18 dd Q3.22 NaN right_only
19 dd Q4.22 NaN right_only
Answered By - BENY
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.