Issue
I've two tables :
- df1 is empty with columns = {'A', 'B', 'A', 'C', C', 'A'} and 62 rows.
- df2 is not empty with columns = {'A', 'B', 'C', 'D', 'E', 'F'} and 62 rows.
I would like to fill df1 rows with df2 rows based on column name (note that df1 and df2 are just for the example, my tables have more than 3,000 columns).
I'm currently using the following code:
for col in df1:
if col in df2:
df1[col] = df2[col]
So I was wondering, is it possible to do that without looping through columns ? As it would be done with a pd.merge or pd.concat. I think that looping through columns is increasing my running time a lot when df1 has a lot of columns.
I've also tried using merge or update but as I've duplicate column names in df1 it seems not working.
Thanks in advance.
Solution
You can reindex
and update
for in place modification:
df1.update(df2.reindex(df1.columns, axis=1))
example:
df1 = pd.DataFrame(columns=['A', 'B', 'A', 'C', 'C', 'A'],
index=range(4))
df2 = pd.DataFrame(np.arange(24).reshape(4,6),
columns=['A', 'B', 'C', 'D', 'E', 'F'])
df1.update(df2.reindex(df1.columns, axis=1))
print(df1)
output:
A B A C C A
0 0 1 0 2 2 0
1 6 7 6 8 8 6
2 12 13 12 14 14 12
3 18 19 18 20 20 18
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.