Issue
I have this df:
import pandas as pd
df = pd.DataFrame({'Time': ['s_1234', 's_1234', 's_1234', 's_5678', 's_8998', 's_8998'],
                   'Control': ['A', '', '', 'B', 'C', ''],
                   'tot_1': ['1', '1', '1', '1', '1', '1'],
                   'tot_2': ['2', '2', '2', '2', '2', '2']})
--------
   Time Control  tot_1  tot_2
0  1234       A      1      2
1  1234       A      1      2
2  1234              1      2
3  5678       B      1      2
4  8998       C      1      2
5  8998              1      2
I would like rows that share the same Time value to be merged into a single row, with the "tot_1" and "tot_2" columns summed within each group. I would also like to keep the Control value where one is present. Like:
   Time Control  tot_1  tot_2
0  1234       A      3      6
1  5678       B      1      2
2  8998       C      2      4
Solution
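Your data is different from the example df you printed: in the code, the Time values carry an "s_" prefix and the totals are strings.
Construct the df and clean it up first: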
import pandas as pd
df = pd.DataFrame({'Time': ['s_1234', 's_1234', 's_1234', 's_5678', 's_8998', 's_8998'],
                   'Control': ['A', '', '', 'B', 'C', ''],
                   'tot_1': ['1', '1', '1', '1', '1', '1'],
                   'tot_2': ['2', '2', '2', '2', '2', '2']})
df.Time = df.Time.str.split("_").str[1]
df = df.astype({"tot_1": int, "tot_2": int})
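To see what those two cleanup lines do on their own, here is a small sketch on a made-up two-row frame. The name `sample` and its values are assumptions for illustration only, not the asker's data.

```python
import pandas as pd

# Made-up sample in the same shape as the question's data (illustrative only).
sample = pd.DataFrame({"Time": ["s_1234", "s_5678"],
                       "tot_1": ["1", "1"],
                       "tot_2": ["2", "2"]})

# "s_1234".split("_") gives ["s", "1234"]; .str[1] keeps the second piece.
sample["Time"] = sample["Time"].str.split("_").str[1]

# The totals start out as strings; cast them so that "sum" adds numbers.
sample = sample.astype({"tot_1": int, "tot_2": int})

print(sample["Time"].tolist())
print(sample["tot_1"].sum(), sample["tot_2"].sum())
```

Note that Time is still a string column after the split; only the "s_" prefix is gone.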
Group by Time and aggregate the values.
df.groupby('Time').agg({"Control": "first", "tot_1": "sum", "tot_2": "sum"}).reset_index()
   Time Control  tot_1  tot_2
0  1234       A      3      6
1  5678       B      1      2
2  8998       C      2      4
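One caveat: "first" works here because the non-empty Control happens to be the first row of every group. If an empty string could come first, a possible workaround (a sketch, not something the question asked for) is to turn the empty strings into missing values, since groupby "first" skips missing values by default. The reordered data below is hypothetical.

```python
import pandas as pd

# Hypothetical variant of the data: the blank Control comes first for s_1234.
df = pd.DataFrame({"Time": ["s_1234", "s_1234", "s_5678"],
                   "Control": ["", "A", "B"],
                   "tot_1": ["1", "1", "1"],
                   "tot_2": ["2", "2", "2"]})
df["Time"] = df["Time"].str.split("_").str[1]
df = df.astype({"tot_1": int, "tot_2": int})

# Replace empty strings with missing values so that "first" can skip them.
df["Control"] = df["Control"].mask(df["Control"] == "")

result = (
    df.groupby("Time")
      .agg({"Control": "first", "tot_1": "sum", "tot_2": "sum"})
      .reset_index()
)
print(result)
```

Without the `mask` line, the row for 1234 would keep the empty string instead of "A".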
EDIT for comment: Not sure if that's the best way to do it, but you could construct your agg information like this:
n = 2
agg_ = {"Control": "first"} | {f"tot_{i+1}": "sum" for i in range(n)}
df.groupby('Time').agg(agg_).reset_index()
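The `|` operator for dicts needs Python 3.9 or later. As a possible alternative (a sketch, assuming every column to be summed starts with "tot_"), the dict can be built from the column names themselves, so `n` does not have to be known up front. The small frame and the name `tot_cols` are illustrative.

```python
import pandas as pd

# Small already-cleaned frame, for illustration only.
df = pd.DataFrame({"Time": ["1234", "1234", "5678"],
                   "Control": ["A", "", "B"],
                   "tot_1": [1, 1, 1],
                   "tot_2": [2, 2, 2]})

# Pick up every column whose name starts with "tot_" instead of hard-coding n.
tot_cols = [col for col in df.columns if col.startswith("tot_")]

# ** unpacking merges the two dicts and also works before Python 3.9.
agg_ = {"Control": "first", **{col: "sum" for col in tot_cols}}

result = df.groupby("Time").agg(agg_).reset_index()
print(result)
```

This keeps working if more tot_ columns are added later, at the cost of relying on the naming convention.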
Answered By - bitflip