Issue
I am trying to merge two dataframes together to replace possible missing values. For example:
df1:
Fruit Price
Apple 5
Banana 7
Orange 4
df2:
Fruit Price
Pear 3
result:
Fruit Price
Apple 5
Banana 7
Orange 4
Pear 3
I've used the concat method to do this and it works, however when both dataframes have the same column instead of having both duplicate values I want to use the data from the first df and ignore the second df. For example:
df1:
Fruit Price
Apple 5
Banana 7
Orange 4
Pear 6
df2:
Fruit Price
Pear 3
result:
Fruit Price
Apple 5
Banana 7
Orange 4
Pear 6
How can I do this so it works on both examples?
Solution
You could use concat
and chain drop_duplicates
on your Fruit column. Just always make sure to put the dataframe you want the values to keep in case of duplicated as the first parameter in concat
:
# Case 1
>>> pd.concat([df1b,df2]).drop_duplicates('Fruit')
Fruit Price
0 Apple 5
1 Banana 7
2 Orange 4
0 Pear 3
# Case 2
>>> pd.concat([df1,df2]).drop_duplicates('Fruit')
Fruit Price
0 Apple 5
1 Banana 7
2 Orange 4
3 Pear 6
Setup
df1 = pd.DataFrame({'Fruit':['Apple','Banana','Orange','Pear'],
'Price':[5,7,4,6]})
df1b = pd.DataFrame({'Fruit':['Apple','Banana','Orange'],
'Price':[5,7,4]})
df2 = pd.DataFrame({'Fruit':['Pear'],
'Price':[3]})
Answered By - sophocles
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.