Issue
I am working on making a dynamic table by adding columns in same row, however, range of columns is determined based on two columns colum_start and colum_stop:
df = pd.DataFrame({
'Name': ['Apple', 'Banana', 'Orange', 'Cherry', 'Egg', 'Cheese'],
'ID': ['F1', 'F1', 'F1', 'F1', 'V1', 'V2'],
'202101': [1, 10, 20, 30, 40, 50],
'202102': [20, 15, 12, 18, 32, 12],
'202103': [3, 11, 25, 32, 13, 4],
'202104': [32, 11, 9, 82, 2, 1],
'202105': [9, 5, 11, 11, 2, 5],
'colum_start ': [202102, 202101, 202102, 202103, 202101, 202103],
'colum_stop': [202105, 202103, 202105, 202104, 202102, 202105],
})
df
I want to create a sum_column that sums the columns based on colum_start and colum_stop in column name.
desired output:
Appriciate your help!!
Solution
Use numpy broadcasting for select values between both columns by compare by columns, and chained cumulative sums compared for 1
:
a = df['colum_start'].astype(str).to_numpy()[:, None]
b = df['colum_stop'].astype(str).to_numpy()[:, None]
c = df.columns.to_numpy()
mask = (np.cumsum(c == a, axis=1) == 1) & (np.cumsum(c[::-1] == b, axis=1)[:, ::-1] == 1)
df['SUM_OF_RANGE'] = df.where(mask, 0).sum(axis=1)
print (df)
Name ID 202101 202102 202103 202104 202105 colum_start \
0 Apple F1 1 20 3 32 9 202102
1 Banana F1 10 15 11 11 5 202101
2 Orange F1 20 12 25 9 11 202102
3 Cherry F1 30 18 32 82 11 202103
4 Egg V1 40 32 13 2 2 202101
5 Cheese V2 50 12 4 1 5 202103
colum_stop SUM_OF_RANGE
0 202105 64
1 202103 36
2 202105 57
3 202104 114
4 202102 72
5 202105 10
If small data and perfromance is not important use lsit comprehension with DataFrame.loc
:
c = df.columns.to_numpy()
df['SUM_OF_RANGE'] = [df.loc[i, str(a):str(b)].sum()
for i, a, b in zip(df.index, df['colum_start'], df['colum_stop'])]
print (df)
Name ID 202101 202102 202103 202104 202105 colum_start \
0 Apple F1 1 20 3 32 9 202102
1 Banana F1 10 15 11 11 5 202101
2 Orange F1 20 12 25 9 11 202102
3 Cherry F1 30 18 32 82 11 202103
4 Egg V1 40 32 13 2 2 202101
5 Cheese V2 50 12 4 1 5 202103
colum_stop SUM_OF_RANGE
0 202105 64
1 202103 36
2 202105 57
3 202104 114
4 202102 72
5 202105 10
Answered By - jezrael
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.