Issue
I would like to distribute the payments based on the states as follows:
payment:
cust_id name date amount
0 A Edward 2021-01-01 3000
1 B Henry 2021-01-01 5000
2 C Ferth 2021-02-01 1000
state:
cust_id contract_id state1 state2 state3
0 A 1 Alabama Alaska Arizona
1 A 2 Indiana Alabama Nebraska
2 B 3 Alabama NaN Arizona
3 C 4 Alaska Nebraska NaN
4 C 5 NaN Maine Nebraska
The customer may have at least one contract and each contract covers different states. Each state must be counted and those that occur twice will be counted twice in the computation of the ratio and so on. The ratio will then be multiplied by the amount to get the allocated amount per state.
Output:
cust_id name date state ratio amount
0 A Edward 2021-01-01 Alabama 0.333333 1000
1 A Edward 2021-01-01 Alaska 0.166667 500
2 A Edward 2021-01-01 Arizona 0.166667 500
3 A Edward 2021-01-01 Indiana 0.166667 500
4 A Edward 2021-01-01 Nebraska 0.166667 500
5 B Henry 2021-01-01 Alabama 0.500000 2500
6 B Henry 2021-01-01 Arizona 0.500000 2500
7 C Ferth 2021-02-01 Alaska 0.250000 250
8 C Ferth 2021-02-01 Nebraska 0.500000 500
9 C Ferth 2021-02-01 Maine 0.250000 250
Solution
This can be achieved using df.melt
followed by df.groupby
and value_counts
with normalize=True
so that we can flatten the states for each customer and based on the no of appearances, we get the pct share of each state. Then merge with the payments dataframe and finally multiply the amount
with the pct share to get the new amount:
Solution:
u = (state.melt(['cust_id','contract_id'],value_name='state')
.groupby("cust_id")['state'].value_counts(normalize=True)
.reset_index(name='ratio'))
out = payment.merge(u,on='cust_id')
out['new_amount'] = out['amount']*out['ratio']
Output:
print(out)
cust_id name date amount state ratio new_amount
0 A Edward 2021-01-01 3000 Alabama 0.333333 1000.0
1 A Edward 2021-01-01 3000 Alaska 0.166667 500.0
2 A Edward 2021-01-01 3000 Arizona 0.166667 500.0
3 A Edward 2021-01-01 3000 Indiana 0.166667 500.0
4 A Edward 2021-01-01 3000 Nebraska 0.166667 500.0
5 B Henry 2021-01-01 5000 Alabama 0.500000 2500.0
6 B Henry 2021-01-01 5000 Arizona 0.500000 2500.0
7 C Ferth 2021-02-01 1000 Alaska 0.250000 250.0
8 C Ferth 2021-02-01 1000 Maine 0.250000 250.0
9 C Ferth 2021-02-01 1000 Nebraska 0.500000 500.0
Answered By - anky
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.