Issue
I want to check whether a string in a Pandas column contains a word from a dictionary and, if there is a match, create a new column with the matching dictionary key as its value. E.g. dict = {'Car': ['Merc', 'BMW', 'Ford', 'Suzuki'], 'MotorCycle': ['Harley', 'Yamaha', 'Triump']}
df
Person | Sentence |
---|---|
A | 'He drives a Merc' |
B | 'He rides a Harley' |
should return
Person | Sentence | Vehicle |
---|---|---|
A | 'He drives a Merc' | 'Car' |
B | 'He rides a Harley' | 'MotorCycle' |
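For anyone who wants to reproduce the example, here is a minimal sketch of the input. It assumes pandas is imported as `pd` and that the single quotes shown in the table are literally part of the stored strings (an assumption; the question does not say either way). The dictionary is named `dct` here to avoid shadowing the built-in `dict`.

```python
import pandas as pd

# Category -> keywords, copied from the question.
dct = {
    "Car": ["Merc", "BMW", "Ford", "Suzuki"],
    "MotorCycle": ["Harley", "Yamaha", "Triump"],
}

# Example frame; the surrounding single quotes are assumed to be stored in the data.
df = pd.DataFrame(
    {
        "Person": ["A", "B"],
        "Sentence": ["'He drives a Merc'", "'He rides a Harley'"],
    }
)
print(df)
```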
Solution
One solution is to create a reversed dictionary from dct and search for the right word using str.split:
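dct = {
    "Car": ["Merc", "BMW", "Ford", "Suzuki"],
    "MotorCycle": ["Harley", "Yamaha", "Triump"],
}

# Invert the mapping so each keyword points back to its category key.
dct_inv = {i: k for k, v in dct.items() for i in v}


def find_word(x):
    # Strip surrounding spaces and quotes, then test each whitespace-separated word.
    for w in x.strip(" '").split():
        if w in dct_inv:
            return dct_inv[w]
    return None  # no keyword found in this sentence


df["Vehicle"] = df["Sentence"].apply(find_word)
print(df)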
Prints:
Person Sentence Vehicle
0 A 'He drives a Merc' Car
1 B 'He rides a Harley' MotorCycle
Answered By - Andrej Kesely
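As a variation that is not part of the answer above, the same lookup can be sketched without a Python-level loop by using `Series.str.extract` with a single regular expression followed by `Series.map`. The sample frame, the third row (person "C"), and the variable name `pattern` are illustrative assumptions, not taken from the original post.

```python
import re

import pandas as pd

dct = {
    "Car": ["Merc", "BMW", "Ford", "Suzuki"],
    "MotorCycle": ["Harley", "Yamaha", "Triump"],
}
# Keyword -> category, same inversion as in the answer.
dct_inv = {word: key for key, words in dct.items() for word in words}

# Sample data; row "C" is a hypothetical no-match case added for illustration.
df = pd.DataFrame(
    {
        "Person": ["A", "B", "C"],
        "Sentence": ["'He drives a Merc'", "'He rides a Harley'", "'He walks to work'"],
    }
)

# One alternation of all keywords, wrapped in word boundaries.
# re.escape guards against keywords that contain regex metacharacters.
pattern = r"\b(" + "|".join(re.escape(w) for w in dct_inv) + r")\b"

# str.extract returns the first matching keyword per row (NaN when nothing matches);
# map then translates that keyword into its category key.
df["Vehicle"] = df["Sentence"].str.extract(pattern, expand=False).map(dct_inv)
print(df)
```

Because the pattern relies on word boundaries rather than whitespace splitting, a keyword followed by punctuation (for example "Merc,") still matches. Rows with no keyword end up as NaN rather than None.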