Issue
I have a dataframe, dfTest, with three columns. One has year information, one has month information, and one is a list of dates.
Date1_Year Date1_Month Datelist
2020 5 ['2020-04','2020-01']
2020 9 [‘2020-08’, ‘2020-07’,’2020-06’]
2021 9 ['2021-09']
2020 12 ['2020-10','2020-01']
1999 03 NaN
Taking the first row as an example, I would like to calculate the number of months between the date represented by Date1_Year and Date2_Year (that is '2020-05') and each of the dates in DateList. The desired output would be:
Date1_Year Date1_Month DateList Month_Diff
2020 5 ['2020-04','2020-01'] [1,4]
2020 9 ['2020-08','2020-07','2020-06'] [1,2,3]
2021 9 ['2021-09'] [0]
2020 12 ['2020-10','2020-01'] [2,11]
1999 03 NaN
This is my code:
for i in range(len(dfTest)):
My_Date_List = dfTest.loc[i,'DateList']
Date1Months = dfTest.loc[i,'Date1_Year']*12 + dfTest.loc[i,'Date1_Month']
months_since_date1 = []
if isinstance(My_Date_List,list):
if len(My_Date_List) >= 1:
for j in range(len(My_Date_List)):
Date2_Year = int(My_Date_List[j][:4])
Date2_Month = int(My_Date_list[j][5:])
Date2Months = Date2_Year*12 + Date2_Month
if Date1Months < Date2Months:
months_since_date1.append(-999)
else:
months_since_date1.append(Date1Months-Date2Months)
dfTest.loc[i,'Month_Diff'] = months_since_date1
I was able to test that months_since_date1 was being created as I expected, but when I add the line of code
dfTest.loc[i,'Month_Diff'] = months_since_date1
I get the message "ValueError: must have equal len keys and value when setting an iterable"
Please advise! What am I doing wrong?
Solution
Previous initialization via dfTest['Month_Diff'] = [[]] * len(dfTest)
and later assignment via dfTest['Month_Diff'][i] = months_since_date1
do the trick, resulting in
#initialize the new column
dfTest['Month_Diff'] = [[]] * len(dfTest)
for i in range(len(dfTest)):
My_Date_List = dfTest.loc[i,'DateList']
Date1Months = dfTest.loc[i,'Date1_Year']*12 + dfTest.loc[i,'Date1_Month']
months_since_date1 = []
if isinstance(My_Date_List,list):
if len(My_Date_List) >= 1:
for j in range(len(My_Date_List)):
Date2_Year = int(My_Date_List[j][:4])
Date2_Month = int(My_Date_List[j][5:])
Date2Months = Date2_Year*12 + Date2_Month
if Date1Months < Date2Months:
months_since_date1.append(-999)
else:
months_since_date1.append(Date1Months-Date2Months)
#assign the resulting list
dfTest['Month_Diff'][i] = months_since_date1
Answered By - flyakite
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.