Issue
i have created the below table in excel and now i am looking to replicate the same in python.
- settled accounts are further distributed basis the fixed percentages for subsequent months in column
I am looking to create the similar dataframe in python and have initiated with below approach :
data = {
'Assigned Accounts': [1428, 1415, 1398, 1402, 1468, 1503, 1694],
'Month': ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22']
}
df = pd.DataFrame(data)
df['Settled Accounts'] = int(data['Assigned Accounts']*0.17)
# Fixed percentage distribution
percentages = [0.46, 0.36, 0.06, 0.03, 0.02, 0.01, 0.01]
# Creating columns for subsequent months and distributing the settled debtors
months = ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23']
how it could be approached further.?
i am looking for final dataframe which gives the sum of columns vertically as below :
Month Total
Jan - 23 350
Feb - 23 335
Mar - 23 343
Apr - 23 341 and so on
** Updated Question :**
for below dataframe what will be the change in approach :
data = {
'Assigned Accounts': [674, 779, 812, 735, 753, 677, 628, 589, 570, 581, 550, 595, 596, 659, 754, 731, 739, 741, 766, 796, 752, 821, 818, 868, 850, 909, 927, 915],
'Month': ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23', 'Jul-23', 'Aug-23', 'Sep-23', 'Oct-23', 'Nov-23', 'Dec-23', 'Jan-24', 'Feb-24', 'Mar-24', 'Apr-24', 'May-24', 'Jun-24', 'Jul-24', 'Aug-24', 'Sep-24']
}
df = pd.DataFrame(data)
months: ['Jun-22', 'Jul-22', 'Aug-22', 'Sep-22', 'Oct-22', 'Nov-22', 'Dec-22', 'Jan-23', 'Feb-23', 'Mar-23', 'Apr-23', 'May-23', 'Jun-23', 'Jul-23', 'Aug-23', 'Sep-23', 'Oct-23', 'Nov-23', 'Dec-23', 'Jan-24', 'Feb-24', 'Mar-24', 'Apr-24', 'May-24', 'Jun-24', 'Jul-24', 'Aug-24', 'Sep-24']
rest the problem statement would remain the same.
Solution
If I understand correctly, you want to multiply each row value by your percentages and shift the values each month.
You can easily do this with numpy and broadcasting.
NB. In your picture, it looked like you used a ~23% factor (not 17%), I used that below for better comparison of the numbers.
# 23% of initial value
df['Settled Accounts'] = df['Assigned Accounts'].mul(0.23).astype(int)
# create output array
tmp = np.full((len(df), len(months)), np.nan)
# indexer for rows
a = np.arange(len(df))
# indexer for shifted values
col = (a[:,None] + np.arange(len(percentages))).ravel()
m = col<len(a)
idx = np.repeat(a, len(percentages))[m]
# broadcast the multiplication by "percent"
tmp[idx, col[m]] = (df['Settled Accounts'].to_numpy()[:,None]
* np.array(percentages)
).ravel()[m].astype(int)
# convert to DataFrame
tmp = pd.DataFrame(tmp, index=df.index, columns=months)
# add sum as new row
tmp.loc['sum'] = tmp.sum()
out = df.join(tmp, how='outer')
Output:
Assigned Accounts Month Settled Accounts Jun-22 Jul-22 Aug-22 Sep-22 Oct-22 Nov-22 Dec-22 Jan-23 Feb-23 Mar-23 Apr-23 May-23 Jun-23
0 1428.0 Jun-22 328.0 150.0 118.0 19.0 9.0 6.0 3.0 3.0 NaN NaN NaN NaN NaN NaN
1 1415.0 Jul-22 325.0 NaN 149.0 117.0 19.0 9.0 6.0 3.0 3.0 NaN NaN NaN NaN NaN
2 1398.0 Aug-22 321.0 NaN NaN 147.0 115.0 19.0 9.0 6.0 3.0 3.0 NaN NaN NaN NaN
3 1402.0 Sep-22 322.0 NaN NaN NaN 148.0 115.0 19.0 9.0 6.0 3.0 3.0 NaN NaN NaN
4 1468.0 Oct-22 337.0 NaN NaN NaN NaN 155.0 121.0 20.0 10.0 6.0 3.0 3.0 NaN NaN
5 1503.0 Nov-22 345.0 NaN NaN NaN NaN NaN 158.0 124.0 20.0 10.0 6.0 3.0 3.0 NaN
6 1694.0 Dec-22 389.0 NaN NaN NaN NaN NaN NaN 178.0 140.0 23.0 11.0 7.0 3.0 3.0
sum NaN NaN NaN 150.0 267.0 283.0 291.0 304.0 316.0 343.0 182.0 45.0 23.0 13.0 6.0 3.0
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.