Issue
This is related to another question (pandas loop through a list of dataframes (df) to do calculation, results in previous df to be referenced and used for next df), but with more specific details: when looping over a list of DataFrames (dfs), the calculation for each df depends on the result for the previous one.
Calculation requirement: each df has the columns i_f, r_f1, r_f2 and r_f3. In the first df, i_f is compared to a threshold of 0.95 (IDs greater than or equal to it pass) and the three r_f columns are compared to 0.3 (IDs less than or equal to it pass); the four comparisons together determine the eligible IDs. In the next df, i_f is compared to the same threshold of 0.95, but for the three r_f columns, IDs that were eligible in the previous df are compared to 0.3333 and all other IDs to 0.3. Again, the four comparisons together determine the eligible IDs in this df, and so on for the rest of the dfs in the list.
Below is an example df list. The expected output for the IDs in df1 is 1,0,1, and for the IDs in df2 it is 1,0,1,1,1.
import pandas as pd

df1 = pd.DataFrame({'ID':[1,2,3],
                    'i_f':[0.967385562,0.869575345,1],
                    'r_f1':[0.18878187,0.327355797,0.100753051],
                    'r_f2':[0.047237449,0.056038276,0.189434048],
                    'r_f3':[0.095283998,0.2554309,0.138240321]})
df2 = pd.DataFrame({'ID':[1,2,3,4,5],
                    'i_f':[0.985,1,0.993297332,1,1],
                    'r_f1':[0.300009355,0.331788473,0.146077926,0.167329833,0.245227094],
                    'r_f2':[0.152293038,0.06668,0.196683885,0.101269411,0.02493159],
                    'r_f3':[0.111617815,0.042016,0.175285158,0.085330897,0.238370325]})
df_lst = [df1, df2]
Solution
The logic is the same as in your previous question. The threshold for i_f is constant, so it is not a problem, unlike the threshold for the r_f* columns, which must be recomputed for each ID at every iteration. However, within a row the threshold is the same for all r_f* columns:
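import numpy as np

# r_f* threshold per ID, carried over from the previous df (empty at the start)
thd = {}

for df in df_lst:
    # boolean mask for i_f
    m1 = df['i_f'].ge(0.95)
    # boolean masks for r_f*: per-row threshold from the previous df, 0.3 for unseen IDs
    m2 = df.filter(like='r_f').le(df['ID'].map(thd).fillna(0.3).values, axis=0)
    # compute the final result
    res = (m1 & m2.all(axis=1)).astype(int)
    df['r_results'] = res
    # update the threshold dictionary: 0.3333 (as stated in the question) for
    # eligible IDs, 0.3 otherwise; dict |= requires Python 3.9+
    thd |= dict(zip(df['ID'], np.where(res, 0.3333, 0.3)))
    print(df, end='\n\n')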
# Output
   ID       i_f      r_f1      r_f2      r_f3  r_results
0   1  0.967386  0.188782  0.047237  0.095284          1
1   2  0.869575  0.327356  0.056038  0.255431          0
2   3  1.000000  0.100753  0.189434  0.138240          1

   ID       i_f      r_f1      r_f2      r_f3  r_results
0   1  0.985000  0.300009  0.152293  0.111618          1
1   2  1.000000  0.331788  0.066680  0.042016          0
2   3  0.993297  0.146078  0.196684  0.175285          1
3   4  1.000000  0.167330  0.101269  0.085331          1
4   5  1.000000  0.245227  0.024932  0.238370          1
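
The least obvious line in the loop is the one that builds m2. There, df['ID'].map(thd) looks up each row's threshold from the previous iteration, fillna(0.3) supplies the default for IDs that have not been seen before, and le(..., axis=0) compares every r_f* column in a row against that row's own threshold. Below is a minimal sketch of that step in isolation. The demo frame and the contents of thd are made-up values for illustration, not data from the question:

```python
import pandas as pd

# Hypothetical thresholds carried over from a previous DataFrame:
# ID 1 was eligible (relaxed threshold), ID 2 was not.
thd = {1: 0.3333, 2: 0.3}

# Illustrative data; ID 3 has no entry in thd.
demo = pd.DataFrame({'ID': [1, 2, 3],
                     'r_f1': [0.32, 0.32, 0.32],
                     'r_f2': [0.10, 0.10, 0.10]})

# One threshold per row; IDs missing from thd fall back to 0.3.
row_thd = demo['ID'].map(thd).fillna(0.3)

# axis=0 lines the threshold array up with the rows, so each r_f* value
# is compared against the threshold of the row it belongs to.
mask = demo.filter(like='r_f').le(row_thd.values, axis=0)

print(row_thd.tolist())           # per-row thresholds
print(mask.all(axis=1).tolist())  # True where every r_f* column passes
```

With these values, only ID 1 passes: the same r_f1 of 0.32 is within the relaxed threshold but above the default 0.3 used for IDs 2 and 3.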
Answered By - Corralien