Issue
I am using scrapy-splash to crawl this website, and the spider fails with "[twisted] CRITICAL: Unhandled error in Deferred:".
I have tried everything I could find on Stack Overflow and other websites.
Code of my spider:
class DarazspidySpider(scrapy.Spider):
    name = 'darazspidy'

    def start_requests(self):
        url = 'https://www.daraz.pk/smartphones/'
        SplashRequest(url=url, callback=self.parse,
                      endpoint='render.html', args={'wait': 0.5})

    def parse(self, response):
        for phone in response.xpath('//div[@class="c5TXIP"]'):
            yield {
                'Name',
                phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c16H9d", " " ))]//a').extract(),
                'price',
                phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c13VH6", " " ))]').extract(),
            }
Solution
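You are yielding a set, not a dictionary: the keys and values are separated by commas instead of colons.
Building that set fails with a TypeError, because .extract() returns lists, and lists are unhashable and cannot be added to a set.
Try yielding a dictionary instead, something like this: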
def parse(self, response):
    for phone in response.xpath('//div'):
        yield {
            'Name': phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c16H9d", " " ))]//a').extract(),
            'price': phone.xpath('.//*[contains(concat( " ", @class, " " ), concat( " ", "c13VH6", " " ))]').extract(),
        }
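The set-versus-dictionary point can be checked without Scrapy. The following is a minimal sketch, assuming the two lists stand in for what .extract() returns; the values and the helper names build_set and build_dict are made up for illustration.

```python
# Stand-ins for .extract() results, which are lists of strings.
# The values are invented for this illustration.
names = ["<a>Phone A</a>"]
prices = ["<span>Rs. 1</span>"]


def build_set():
    # Commas between "keys" and "values" make this a set literal.
    # Lists are unhashable, so Python raises TypeError here.
    return {"Name", names, "price", prices}


def build_dict():
    # Colons make it a dictionary, which can hold lists as values.
    return {"Name": names, "price": prices}


try:
    build_set()
    set_error = None
except TypeError as exc:
    set_error = str(exc)

item = build_dict()

print("set literal failed with:", set_error)
print("dict item:", item)
```

The dictionary version is the shape Scrapy expects from a callback that yields plain items.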
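You probably also need to yield your Splash request from start_requests. As written, the request is created and then discarded, so the method returns None:
    yield SplashRequest(url=url, callback=self.parse,
                        endpoint='render.html', args={'wait': 0.5})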
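The effect of the missing yield can also be seen in plain Python. This is a rough sketch, not Scrapy code: the tuple is a hypothetical stand-in for a SplashRequest object, and both function names are invented for the comparison.

```python
URL = "https://www.daraz.pk/smartphones/"


def start_requests_without_yield():
    # The "request" is built and immediately thrown away, like the
    # original SplashRequest(...) call. The function returns None.
    ("request for", URL)


def start_requests_with_yield():
    # With yield, the function becomes a generator that hands the
    # "request" to whoever iterates over it.
    yield ("request for", URL)


missing = start_requests_without_yield()
produced = list(start_requests_with_yield())

print("without yield:", missing)
print("with yield:", produced)
```

Scrapy iterates over whatever start_requests returns, so a method that returns None gives it nothing to schedule.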
Answered By - Guillaume