Issue
I have a data frame df1:
import pandas as pd
data1 = {'id': {0: 'A', 1: 'A', 2: 'A', 3: 'B', 4: 'C', 5: 'B'}, 'col1': {0: '7', 1: ' ', 2: '8', 3: '3', 4: '5', 5: '1'}}
df1 = pd.DataFrame(data1)
and df2:
data2 = {'id': {0: 'A', 1: 'B', 2: 'C'}, 'testCol': {0: '0', 1: '4', 2: '1'}}
df2 = pd.DataFrame(data2)
Using pandas or numpy, how can I compare df1['col1'] and df2['testCol'] for each id and return the maximum value, either in df2['testCol'] itself or in a new column in df2?
Expected result:
ID | testCol |
---|---|
A | 8 |
B | 4 |
C | 5 |
OR
ID | testCol | maxCol |
---|---|---|
A | 0 | 8 |
B | 4 | 4 |
C | 1 | 5 |
Note: df1 and df2 are only examples.
Solution
Try:
x = (
    pd.concat(
        [df1.groupby("id")["col1"].max(), df2.set_index("id")["testCol"]],
        axis=1,
    )
    .max(axis=1)
    .astype(int)
    .reset_index(name="testCol")
)
print(x)
Prints:
  id  testCol
0  A        8
1  B        4
2  C        5
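Both columns in the example hold strings, so the maximum above is taken by string comparison. That gives the right answer for single-digit values, but a value such as '10' would sort below '9'. Here is a minimal sketch of a numeric variant that also produces the second requested layout (a new maxCol column). It assumes blank entries such as ' ' should be ignored; the names col1_max, aligned and out are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"id": ["A", "A", "A", "B", "C", "B"],
                    "col1": ["7", " ", "8", "3", "5", "1"]})
df2 = pd.DataFrame({"id": ["A", "B", "C"], "testCol": ["0", "4", "1"]})

# Convert the strings to numbers; unparseable entries such as ' ' become NaN.
col1_num = pd.to_numeric(df1["col1"], errors="coerce")

# Per-id maximum of col1 (NaN values are skipped by default).
col1_max = col1_num.groupby(df1["id"]).max()

out = df2.copy()
test_num = pd.to_numeric(out["testCol"], errors="coerce")

# Align the per-id maximum with df2's rows, then take the row-wise maximum.
aligned = out["id"].map(col1_max)
out["maxCol"] = pd.concat([test_num, aligned], axis=1).max(axis=1).astype(int)

print(out)
```

For the sample data, maxCol comes out as 8, 4 and 5 while testCol is left unchanged. If some id had no usable number in either frame, the final astype(int) would raise, so in that case drop the cast or fill the missing values first.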
Answered By - Andrej Kesely