Issue
I have a DataFrame of itemsets and I want to convert it to asymmetric binary attributes, with one 0/1 column per item.
I have managed to do this, but only with a very lengthy process: creating a new list for each row of the DataFrame and iterating through it to check for each item.
import numpy as np
import pandas as pd

list1= list(shopping.loc[0])
list2= list(shopping.loc[1])
list3= list(shopping.loc[2])
list4= list(shopping.loc[3])
list5= list(shopping.loc[4])
list6= list(shopping.loc[5])
list7= list(shopping.loc[6])
list8= list(shopping.loc[7])
list9= list(shopping.loc[8])
list10= list(shopping.loc[9])
ref= ['Milk', 'Burgers', 'Buns', 'Ketchup', 'Coals', 'Beer']
def item_in_list(shop, ref):
    there = []
    for item in ref:
        if item in shop: there.append(1)
        else: there.append(0)
    return there
nl1 = item_in_list(list1, ref)
nl2 = item_in_list(list2, ref)
nl3 = item_in_list(list3, ref)
nl4 = item_in_list(list4, ref)
nl5 = item_in_list(list5, ref)
nl6 = item_in_list(list6, ref)
nl7 = item_in_list(list7, ref)
nl8 = item_in_list(list8, ref)
nl9 = item_in_list(list9, ref)
nl10 = item_in_list(list10, ref)
shop_bin = pd.DataFrame(np.array([nl1, nl2, nl3, nl4, nl5, nl6, nl7, nl8, nl9, nl10]),
                        columns=['Milk', 'Burgers', 'Buns', 'Ketchup', 'Coals', 'Beer'])
The last statement combines these lists into a new DataFrame, shop_bin, with one binary column per item.
Is there a better way to get the same result?
Solution
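We can compare every row of shopping against ref in a single vectorized step. This relies on column i of shopping holding ref[i] when the item is present and a missing value otherwise, so the number of columns must equal len(ref):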
#assert len(ref) == shopping.shape[1]
shop_bin = shopping.eq(ref, axis=1).astype(int).set_axis(ref, axis=1)
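To see what the one-liner does, here is a minimal sketch on made-up data. The three baskets below are hypothetical (the real shopping frame is not shown in the question), and they assume the layout the comparison needs: each column holds one fixed item, with NaN where it was not bought.

```python
import numpy as np
import pandas as pd

ref = ['Milk', 'Burgers', 'Buns', 'Ketchup', 'Coals', 'Beer']

# Hypothetical baskets: column i holds ref[i] if the item was bought, NaN otherwise.
shopping = pd.DataFrame([
    ['Milk', np.nan, 'Buns', np.nan, np.nan, 'Beer'],
    [np.nan, 'Burgers', 'Buns', 'Ketchup', 'Coals', np.nan],
    ['Milk', np.nan, np.nan, np.nan, np.nan, np.nan],
])

# The positional comparison only makes sense if the widths match.
assert len(ref) == shopping.shape[1]

# eq(ref, axis=1) compares each row element-wise with ref (NaN never matches),
# astype(int) turns True/False into 1/0, and set_axis relabels the columns.
shop_bin = shopping.eq(ref, axis=1).astype(int).set_axis(ref, axis=1)
print(shop_bin)
```

Each row of shop_bin has a 1 wherever the basket contains the item named by that column and a 0 elsewhere.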
Answered By - ansev
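If the items are not aligned to fixed columns, the positional comparison above would miss them. A possible variant, closer to the `item in shop` test from the question, checks each item against the whole row. This is only a sketch under that assumption, and the sample baskets are again made up:

```python
import numpy as np
import pandas as pd

ref = ['Milk', 'Burgers', 'Buns', 'Ketchup', 'Coals', 'Beer']

# Hypothetical baskets: items appear in any position, padded with NaN.
shopping = pd.DataFrame([
    ['Beer', 'Milk', np.nan],
    ['Coals', 'Burgers', 'Buns'],
    ['Ketchup', np.nan, np.nan],
])

# For each item, flag the rows where any cell equals that item.
shop_bin = pd.DataFrame(
    {item: shopping.eq(item).any(axis=1).astype(int) for item in ref}
)
print(shop_bin)
```

This makes one pass over the frame per item in ref, so it is slower than the single comparison, but it does not depend on column order or on the frame having exactly len(ref) columns.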