Issue
I am trying to assign attributes to products, and those attributes may overlap.
Given the title, manufacturer, and description, I need to know whether the product is Jeans or something else, and furthermore whether it is Skinny Jeans or another type of Jeans. Going through the scikit-learn exercises, it seems I can only predict one category at a time, which doesn't fit my case. Any suggestions on how to tackle the problem?
What I have in mind right now is to build training data for each category, for example:
jeans = ['desc of jeans 1', 'desc of jeans 2']
skinny_jeans = ['desc of skinny jeans 1', 'desc of skinny jeans 2']
With this training data, I would then ask for the probability that a given unknown product belongs to each category, and expect an answer like this, as a match percentage:
Unknown_Product_1 = {
'jeans': 93,
'skinny_jeans': 80,
't-shirt': 5
}
Am I way off base? If this is the right path to take, how do I achieve it?
Solution
You are probably describing a task called multi-label learning or multi-label classification.
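A key difference between this task and standard classification is that a product can carry several labels at once, and the labels can be related (for example, every Skinny Jeans product is also Jeans). By learning the relationship between the labels, you can sometimes obtain better performance than if you train many independent standard classifiers.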
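As a rough illustration of what multi-label classification could look like in scikit-learn, here is a minimal sketch. The product texts and labels are made up for the example, and the choice of TF-IDF features with one logistic regression per label (via OneVsRestClassifier) is an assumption, not the only way to do it. Note that this baseline trains the labels independently, so it does not exploit relationships between them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical training data: each product text has a SET of labels,
# so a product can be both "jeans" and "skinny_jeans".
texts = [
    "slim stretch skinny jeans in dark blue denim",
    "skinny fit jeans with ripped knees",
    "classic straight leg jeans, rigid denim",
    "relaxed bootcut jeans, mid-rise",
    "cotton crew neck t-shirt with print",
    "plain white t-shirt, short sleeves",
]
labels = [
    {"jeans", "skinny_jeans"},
    {"jeans", "skinny_jeans"},
    {"jeans"},
    {"jeans"},
    {"t-shirt"},
    {"t-shirt"},
]

# Turn the label sets into a binary indicator matrix (one column per label).
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

# One independent binary classifier per label, on top of TF-IDF features.
model = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),
)
model.fit(texts, Y)


def label_scores(text):
    """Return a per-label score in percent for one product text."""
    probs = model.predict_proba([text])[0]
    return {label: round(100 * float(p)) for label, p in zip(mlb.classes_, probs)}


scores = label_scores("blue skinny jeans")
print(scores)
```

Each score is estimated separately per label, so the values do not need to sum to 100, which matches the kind of output asked for in the question. With so little data the numbers themselves are not meaningful. To model dependencies between labels, one option worth trying is sklearn.multioutput.ClassifierChain, which feeds earlier label predictions into later classifiers.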
Answered By - user1149913