Issue
I have been using Scrapy to scrape MichaelKors.com. Until now I have used the SKUs from window.initial_state to get all the attributes and relevant information. However, there are certain pages that I am unable to scrape, such as: https://www.michaelkors.com/zip-hoodie-embellished-skirt-manhattan-crossbody-goldie-moto-boot/_/L-MSTR101179 This page doesn't have SKUs, so I tried getting the description directly like this:
desc = response.xpath('//p[@class="look-description-desktop hide-on-mobile"]/text()').getall()
However, it returns nothing. What other attributes or aspects of the page should I look at to scrape specific information? I am a newbie, so I'm unsure where to go from here.
Solution
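The XPath in your question does give you the description, at least once the page has been rendered in a browser. To check how Scrapy sees the page, open it in the Scrapy shell from the command line, then call view(response) inside the shell to open the downloaded response in your browser: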
scrapy shell 'https://www.michaelkors.com/zip-hoodie-embellished-skirt-manhattan-crossbody-goldie-moto-boot/_/L-MSTR101179'
view(response)
In that view you'll see that the description is available in the element whose property attribute is og:description, so you can extract it as follows:
response.xpath('//*[@property="og:description"]/@content').extract_first()
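The difference between the two lookups can be reproduced without a network request. The following is a minimal sketch using only Python's standard library, not Scrapy itself. RAW_HTML is a made-up stand-in for the unrendered response (the real page's markup and description text will differ), and DescriptionFinder is a hypothetical helper name. It assumes the og:description value sits on a meta tag and that the p element from the question is absent from the downloaded HTML.

```python
from html.parser import HTMLParser

# Made-up sample of what an unrendered response might look like:
# the description is in a meta tag, and the <p> element the question
# targets is absent because it would only appear after rendering.
RAW_HTML = """
<html><head>
<meta property="og:description" content="Sample look description.">
</head><body><div id="app"></div></body></html>
"""


class DescriptionFinder(HTMLParser):
    """Records whether each of the two candidate elements is present."""

    def __init__(self):
        super().__init__()
        self.meta_description = None
        self.paragraph_found = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("property") == "og:description":
            # Same target as //*[@property="og:description"]/@content
            self.meta_description = attrs.get("content")
        elif tag == "p" and attrs.get("class") == "look-description-desktop hide-on-mobile":
            # Same target as the XPath from the question
            self.paragraph_found = True


finder = DescriptionFinder()
finder.feed(RAW_HTML)
print("p element present:", finder.paragraph_found)
print("og:description:", finder.meta_description)
```

If the paragraph is missing but the meta tag is present in the raw HTML, the meta tag is the one to target from a spider.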
Answered By - Wim Hermans