Issue
I want to add a new column a3 to my dataframe df: If the strings of "b" contain strings of "b2" from dataframe df2, the new column a3 should append values from a2 of df2.
first dataframe df:
d = {'a': [100, 300], 'b': ["abc", "dfg"]}
df = pd.DataFrame(data=d, index=[1, 2])
print(df)
a b
1 100 abc
2 300 dfg
second dataframe df2:
d2 = {'a2': ["L1", "L2", "L3"], 'b2': ["bc", "op", "fg"]}
df2 = pd.DataFrame(data=d2, index=[1, 2, 3])
print(df2)
a2 b2
1 L1 bc
2 L2 op
3 L3 fg
The output should look like this:
print(df)
a b a3
1 100 abc L1
2 300 dfg L3
I tried a nested for loop, which did not work.
for i in df.b:
for ii in df2.b2:
for iii in df2.a3:
if ii in i:
df["a3"]=iii
Solution
You need to test all combinations. You could still take advantage of pandas vector str.contains
:
common = (pd.DataFrame({x: df['b'].str.contains(x) for x in df2['b2']})
.replace({False: pd.NA})
.stack()
.reset_index(level=1, name='b2')['level_1'].rename('b2')
)
# 1 bc
# 2 fg
# Name: b2, dtype: object
df.join(common).merge(df2, on='b2')
output:
a b b2 a2
0 100 abc bc L1
1 300 dfg fg L3
Answered By - mozway
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.