Issue
I have two dataframes merged together, with two filled columns of integers and a third column with empty lists.
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['z','x','c','v','b','n'], 'col2': [100, 200, 300, 400, 500, 600]})
df1 = pd.DataFrame({'col1': ['z','x','c','v','b','n'], 'col2': [10, 20, 300, 40, 50, 600]})
# one empty list per row
df['col3'] = np.empty((len(df), 0)).tolist()
df1['col3'] = np.empty((len(df1), 0)).tolist()
df2 = df.merge(df1, on='col1', how='outer')
which yields a frame like this (the numbers shown come from a different data set than the sample above):
col1 col2_x col3_x col2_y col3_y
0 z 10 [] 63 []
1 x 24 [] 1365 []
2 c 642 [] 356 []
3 v 462 [] 2 []
4 b 2454 [] 467 []
5 n 23 [] 23 []
I want to make some calculations if the conditions are met, and if they are, append a value to each list in df2['col3_y'].
condition = [
((df2['col2_y'] != df2['col2_x']) & (len(df2['col3_y']) < 1)),
((df2['col2_y'] != df2['col2_x']) & (len(df2['col3_y']) > 0))
]
action = [
(df2['col2_y'] - df2['col2_x'])/1000,
df2['col3_y'] + [(df2['col2_y'] - df2['col2_x'] - sum(df2['col3_y']))/1000]
]
df2['col3_y'] = np.select(condition, action)
But it throws an error: TypeError: unsupported operand type(s) for +: 'int' and 'list'.
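The error can be reproduced in isolation. The sketch below uses a small stand-in Series (hypothetical, not the frame from the question) to show that the built-in len() and sum() act on the column as a whole rather than on each list inside it, which is likely why the conditions and actions above do not behave per row:

```python
import pandas as pd

# Stand-in for a column whose cells are empty lists (illustrative only).
col = pd.Series([[], [], []])

# len() on a Series is the number of rows, not the length of each list.
n_rows = len(col)

# The built-in sum() starts from 0 and adds each cell, so it tries 0 + [].
message = None
try:
    sum(col)
except TypeError as exc:
    message = str(exc)

# Per-row lengths need an element-wise operation instead.
per_row_len = col.map(len)

print(n_rows)
print(message)
print(per_row_len.tolist())
```

Getting a per-row result therefore needs an element-wise operation such as map or a row-wise apply.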
EXPECTED
For each cell in the list column, if len(list) < 1 (the list is empty):
in the same row, take the value of df2['col2_y'],
subtract from it the value of df2['col2_x'],
divide by 1000, and append the result to the list;
elif len(list) > 0:
in the same row, take the value of df2['col2_y'],
subtract from it the value of df2['col2_x'],
subtract the sum of the list in df2['col3_y'],
divide by 1000, and append the result to the list.
If the values in df2['col2_x'] and df2['col2_y'] are equal,
do nothing.
col1 col2_x col3_x col2_y col3_y
0 z 100 [] 10 [-0.09]
1 x 200 [] 20 [-0.18]
2 c 300 [] 300 []
3 v 400 [] 40 [-0.36]
4 b 500 [] 50 [-0.45]
5 n 600 [] 600 []
Solution
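The expressions passed to np.select are evaluated on whole columns, so they cannot work on each list individually. I rewrote your code to use a row-wise apply instead (np.sum is used so that it works on the initial plain lists as well as on the arrays that np.append returns):
df2['col3_y'] = df2.apply(lambda x: np.append(x['col3_y'],
                                              (x['col2_y']-x['col2_x']-np.sum(x['col3_y']))/1000)
                                    if x['col2_y']!=x['col2_x']
                                    else x['col3_y'],
                          axis=1)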
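For reference, a self-contained version of the row-wise approach on the question's sample data might look like the sketch below. It is a variation, not the exact code above: the helper name append_delta is made up for illustration, and it uses plain list concatenation with the built-in sum() instead of np.append, so the cells stay Python lists as in the expected output.

```python
import pandas as pd

df = pd.DataFrame({'col1': list('zxcvbn'), 'col2': [100, 200, 300, 400, 500, 600]})
df1 = pd.DataFrame({'col1': list('zxcvbn'), 'col2': [10, 20, 300, 40, 50, 600]})
df['col3'] = [[] for _ in range(len(df))]
df1['col3'] = [[] for _ in range(len(df1))]
df2 = df.merge(df1, on='col1', how='outer')

def append_delta(row):
    """Hypothetical helper: return the row's list with one new value appended."""
    if row['col2_y'] == row['col2_x']:
        return row['col3_y']  # equal values: leave the list untouched
    # sum([]) is 0, so the empty-list and non-empty-list cases share one formula
    delta = (row['col2_y'] - row['col2_x'] - sum(row['col3_y'])) / 1000
    return row['col3_y'] + [float(delta)]

df2['col3_y'] = df2.apply(append_delta, axis=1)
print(df2)
```

Because the sum of an empty list is 0, the two branches described in the question collapse into one formula, and only the "equal values" case needs separate handling.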
output after 1 iteration:
col1 col2_x col3_x col2_y col3_y
0 z 10 [] 63 [0.053]
1 x 24 [] 1365 [1.341]
2 c 642 [] 356 [-0.286]
3 v 462 [] 2 [-0.46]
4 b 2454 [] 467 [-1.987]
5 n 23 [] 23 []
output after 3 iterations:
col1 col2_x col3_x col2_y col3_y
0 z 10 [] 63 [0.053, 0.052947, 0.052894052999999996]
1 x 24 [] 1365 [1.341, 1.3396590000000002, 1.3383193409999998]
2 c 642 [] 356 [-0.286, -0.285714, -0.28542828600000003]
3 v 462 [] 2 [-0.46, -0.45954, -0.45908046]
4 b 2454 [] 467 [-1.987, -1.985013, -1.983027987]
5 n 23 [] 23 []
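The multi-iteration output can be reproduced by applying the same step repeatedly. The sketch below rebuilds only the columns it needs from the table shown above; the function name step and the loop are assumptions added for illustration. Each pass subtracts the running sum of the list before dividing, so each appended value is slightly smaller in magnitude than the previous one.

```python
import numpy as np
import pandas as pd

# Values copied from the output tables above.
df2 = pd.DataFrame({
    'col1': list('zxcvbn'),
    'col2_x': [10, 24, 642, 462, 2454, 23],
    'col2_y': [63, 1365, 356, 2, 467, 23],
})
df2['col3_y'] = [[] for _ in range(len(df2))]

def step(frame):
    """Hypothetical helper: run one iteration and return the new column."""
    return frame.apply(
        lambda x: np.append(
            x['col3_y'],
            (x['col2_y'] - x['col2_x'] - np.sum(x['col3_y'])) / 1000,
        ).tolist()
        if x['col2_y'] != x['col2_x']
        else x['col3_y'],
        axis=1,
    )

for _ in range(3):
    df2['col3_y'] = step(df2)

print(df2)
```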
Answered By - mozway