Issue
I tried to run the following spider to crawl Geekbuying listing data, but it fails with an error: SyntaxError: 'yield' outside function.
# -*- coding: utf-8 -*-
import scrapy
class FlashDealsSpider(scrapy.Spider):
    name = 'flash_deals'
    allowed_domains = ['www.geekbuying.com']
    start_urls = ['https://www.geekbuying.com/deals/categorydeals']
    def parse(self, response):
        items = response.xpath("//div[@class='flash_li']")
        for product in items:
            product_name = product.xpath(".//a[@class='flash_li_link']/text()").get()
            product_sale_price = product.xpath(" .//div[@class='flash_li_price']/span/text()").get()
            product_org_price = product.xpath(".//div[@class='flash_li_price']/del/text()").get()
            product_url = product.xpath(".//a[@class='flash_li_link']/@href").get()
            discount = product.xpath(".//div[@class='category_li_off']/text()").get()
        yield
        {
            'name': product_name,
            'sale_price': product_sale_price,
            'orginal_price': product_org_price,
            'url': product_url,
            'discount': discount
        }
        next_page = response.xpath("//a[@class='next']/@href").get()
        if next_page:
            yield response.follow(url=next_page, callback=self.parse)
Does anyone know how to resolve this syntax error?
Thanks in advance!
Solution
yield is used inside a function to pause its execution temporarily and hand control back to the caller of the function.
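As a minimal sketch of that pause-and-resume behaviour, here is a plain Python generator with no Scrapy involved. The name countdown is hypothetical and only serves the illustration.

```python
def countdown(start):
    """Hypothetical generator that yields start, start - 1, ..., 1."""
    while start > 0:
        # Execution pauses here and the value is handed to the caller.
        yield start
        # Execution resumes from this line on the next request for a value.
        start -= 1


gen = countdown(3)
first = next(gen)   # runs the body up to the first yield
rest = list(gen)    # resumes after the yield and drains the remaining values
print(first, rest)
```

Scrapy consumes the generator returned by parse in the same way, pulling one item or request at a time.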
First of all, the code you posted does not produce the error you are talking about, because when writing the question you actually indented the yield keyword inside the function body.
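However, your program will not run as intended, because it is semantically wrong. You need to indent the yield further, inside the for loop, and keep the opening brace of the dictionary on the same line as yield. A bare yield followed by a line break yields None, and the dictionary on the next line is then evaluated and discarded.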
import scrapy
class FlashDealsSpider(scrapy.Spider):
    name = 'flash_deals'
    allowed_domains = ['www.geekbuying.com']
    start_urls = ['https://www.geekbuying.com/deals/categorydeals']
    def parse(self, response):
        items = response.xpath("//div[@class='flash_li']")
        for product in items:
            product_name = product.xpath(".//a[@class='flash_li_link']/text()").get()
            product_sale_price = product.xpath(" .//div[@class='flash_li_price']/span/text()").get()
            product_org_price = product.xpath(".//div[@class='flash_li_price']/del/text()").get()
            product_url = product.xpath(".//a[@class='flash_li_link']/@href").get()
            discount = product.xpath(".//div[@class='category_li_off']/text()").get()
            yield {
                'name': product_name,
                'sale_price': product_sale_price,
                'orginal_price': product_org_price,
                'url': product_url,
                'discount': discount
            }
        next_page = response.xpath("//a[@class='next']/@href").get()
        if next_page:
            yield response.follow(url=next_page, callback=self.parse)
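To see why the placement of yield matters, the following sketch compares the two layouts using plain Python lists in place of Scrapy selectors. The function names broken and fixed and the sample data are assumptions made for the illustration, not part of the original spider.

```python
def broken(rows):
    """Mimics a bare yield placed after the loop, with the dict on the next line."""
    for row in rows:
        name = row
    yield            # yields None, once, after the loop has finished
    {'name': name}   # a separate expression statement; its value is discarded


def fixed(rows):
    """Yields one dict per row, with yield inside the loop."""
    for row in rows:
        yield {'name': row}


sample = ['a', 'b']  # hypothetical stand-in for the scraped products
broken_result = list(broken(sample))
fixed_result = list(fixed(sample))
print(broken_result)
print(fixed_result)
```

The first layout produces a single None regardless of how many products are on the page, while the second produces one item per product, which is what the spider needs.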
Answered By - asimhashmi