Issue
I have a dataset where I want to make a new variable every time the 'Recording' number changes. I want the new variable to include the 'Duration' data for that 'Recording' plus the data from all previous recordings. So for the table below it would be:
Var1 = (3, 3, 3)
Var2 = (3, 3, 3, 4, 6)
Var3 = (3, 3, 3, 4, 6, 4, 3, 1, 4)
And so on. I have several datasets that can have a different number of recordings (but always starting from 1) and a different number of durations for each recording. Any help is greatly appreciated.
Recording | Duration |
---|---|
1 | 3 |
1 | 3 |
1 | 3 |
2 | 4 |
2 | 6 |
3 | 4 |
3 | 3 |
3 | 1 |
3 | 4 |
Solution
You can aggregate the durations into one list per recording, take the cumulative sum of those lists (which concatenates them), then convert the results to tuples and to a dictionary:
d = df.groupby('Recording')['Duration'].agg(list).cumsum().apply(tuple).to_dict()
print (d)
{1: (3, 3, 3), 2: (3, 3, 3, 4, 6), 3: (3, 3, 3, 4, 6, 4, 3, 1, 4)}
print (d[1])
print (d[2])
print (d[3])
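For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the question's table as a DataFrame and, as a swapped-in alternative to `cumsum` on lists, does the running concatenation with `itertools.accumulate`. The helper name `cumulative_durations` is hypothetical and not part of the original answer.

```python
from itertools import accumulate

import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Recording": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Duration": [3, 3, 3, 4, 6, 4, 3, 1, 4],
})


def cumulative_durations(frame):
    """Hypothetical helper: map each recording to all durations seen so far."""
    # One list of durations per recording, ordered by recording number.
    per_recording = frame.groupby("Recording", sort=True)["Duration"].agg(list)
    # Running concatenation: each step appends the current list to the previous total.
    running = accumulate(per_recording, lambda acc, cur: acc + cur)
    return {int(k): tuple(v) for k, v in zip(per_recording.index, running)}


d = cumulative_durations(df)
print(d[2])
```

Because the result is an ordinary dictionary, the same code handles datasets with any number of recordings.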
Your output, with separate variables named Var1, Var2, and so on, is possible but not recommended:
s = df.groupby('Recording')['Duration'].agg(list).cumsum().apply(tuple)
for k, v in s.items():
    # Creates module-level names Var1, Var2, ... dynamically
    globals()[f'Var{k}'] = v
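If the `Var1`, `Var2`, ... names matter, one possible middle ground is to keep them as dictionary keys instead of writing to `globals()`. The sketch below assumes `d` holds the dictionary printed above; the name `named` is just an illustration.

```python
# Assumed input: the dictionary produced by the groupby solution above.
d = {1: (3, 3, 3), 2: (3, 3, 3, 4, 6), 3: (3, 3, 3, 4, 6, 4, 3, 1, 4)}

# Keep the desired names as keys rather than creating global variables.
named = {f"Var{k}": v for k, v in d.items()}

print(named["Var2"])
```

This keeps the lookups explicit and avoids accidentally overwriting an existing name in the module namespace.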
Answered By - jezrael