Issue
I have a dictionary where the keys are numbers and the values are lists of strings. I want to create a dataframe column whose values are the dictionary keys, where the key for each row is selected by matching the value of another column in that row to an item in the dictionary's value lists. See the example code below.
Sample starting dataframe and dictionary:
dict_x = {1: ['a'], 2: ['b', 'c', 'e'], 3: ['d', 'f']}
df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd', 'e', 'f']})
Desired output:
df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd', 'e', 'f'], 'Number': [1, 2, 2, 3, 2, 3]})
I thought something like df['Number'] = df['ID'].apply(lambda x : ???) would work, but I'm struggling with the conditions here. I also tried writing some for loops, but ran into issues with only the last iteration of the loop being preserved when I wrote the column.
Solution
Simply invert the dictionary dict_x by switching the roles of keys and values (loop over the list elements to do that).
import pandas as pd

# set up the dictionary and dataframe properly
dict_x = {1: ['a'], 2: ['b', 'c', 'e'], 3: ['d', 'f']}
df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd', 'e', 'f']})

# reverse the dictionary: map each list element back to its key
rev_dict_x = dict()
for k, v in dict_x.items():
    for v_elem in v:
        rev_dict_x[v_elem] = k

# replace each ID with its number
df['Number'] = df['ID'].replace(rev_dict_x)
print(df)
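As a possible variant, the inversion can be written as a dict comprehension and applied with Series.map instead of replace. This is only a sketch on the same sample data, not the answer's original method. The difference in behavior is that map turns IDs missing from the dictionary into NaN, whereas replace leaves them as the original string.

```python
import pandas as pd

dict_x = {1: ['a'], 2: ['b', 'c', 'e'], 3: ['d', 'f']}
df = pd.DataFrame({'ID': ['a', 'b', 'c', 'd', 'e', 'f']})

# invert the dictionary: each list element becomes a key pointing to its number
rev_dict_x = {elem: key for key, elems in dict_x.items() for elem in elems}

# map looks each ID up in the inverted dictionary; unmatched IDs become NaN
df['Number'] = df['ID'].map(rev_dict_x)
print(df)
```

Which of the two calls fits better depends on whether unmatched IDs should be visible as missing values or passed through unchanged.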
Note that this assumes each element appears in only one of the lists. Otherwise, building rev_dict_x will overwrite the value for that element, and only the last key encountered is kept.
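If uniqueness is not guaranteed, the overlap can be detected while the dictionary is being inverted. The sketch below uses a made-up dict_x in which 'b' is deliberately listed under two keys; the duplicates dictionary is a hypothetical helper name, not part of the original answer.

```python
# made-up example: 'b' appears under both key 2 and key 3
dict_x = {1: ['a'], 2: ['b', 'c', 'e'], 3: ['d', 'f', 'b']}

rev_dict_x = {}
duplicates = {}  # element -> list of every key that contains it
for key, elems in dict_x.items():
    for elem in elems:
        if elem in rev_dict_x:
            # start the list with the key seen earlier, then add the new one
            duplicates.setdefault(elem, [rev_dict_x[elem]]).append(key)
        rev_dict_x[elem] = key  # the later key overwrites the earlier one

print(duplicates)
print(rev_dict_x['b'])
```

An empty duplicates dictionary means the inversion lost nothing; a non-empty one shows which elements were ambiguous and which keys competed for them.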
Answered By - 7shoe