Issue
I have read several examples and articles, but I am still not sure when to use yield and when to use return. Here are two examples from my code where the question came up. Which one should I use?
spider.py
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule

# LatinDanceCalendarItem and LatinDanceCalendarItemLoader are defined in
# items.py and must be imported from the project's items module.


class LatindancecalendarSpider(CrawlSpider):
    name = "latindancecalendar"
    allowed_domains = ["latindancecalendar.com"]
    start_urls = ["https://latindancecalendar.com/festivals/"]
    rules = (
        Rule(
            LinkExtractor(
                restrict_xpaths=("//div[@class='eventline event_details']/a")
            ),
            callback="parse_event",
        ),
    )

    def parse_event(self, response):
        # event = ItemLoader(item=LatinDanceCalendarItem(), response=response)
        event = LatinDanceCalendarItemLoader(
            item=LatinDanceCalendarItem(), response=response
        )
        # event.default_output_processor = TakeFirst()
        event.add_xpath("name", '//h1[@class="page-title"]/text()')
        event.add_xpath("date", '//div[@class="vevent"]/div/span/b/text()')
        return event.load_item()  # yield or return?
items.py
from itemloaders.processors import TakeFirst
from scrapy.loader import ItemLoader


def lowercase_processor(self, values):
    for v in values:
        yield v.lower()  # yield or return?


class LatinDanceCalendarItemLoader(ItemLoader):
    default_output_processor = TakeFirst()
    city_in = lowercase_processor
Solution
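A function that contains yield is a generator function: calling it returns a generator that produces every v in values, lowercased, as it is iterated. A return inside the loop exits the function on the first iteration, so you only get the first v in values, lowercased, and the rest of the loop is skipped. In lowercase_processor, use yield if you want all the values processed. In parse_event there is only a single item to hand back, so either keyword works there: Scrapy accepts a callback that returns one item as well as one that yields items.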
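To make the difference concrete, here is a minimal sketch in plain Python with no Scrapy involved. The function names and the sample list are made up for illustration. The third variant shows one way to keep return and still process every value, by building the whole list first.

```python
def lowercase_with_yield(values):
    # Generator function: calling it returns a generator that
    # produces each lowercased value as it is iterated.
    for v in values:
        yield v.lower()


def lowercase_with_return(values):
    # Exits the function during the first iteration,
    # so the remaining values are never visited.
    for v in values:
        return v.lower()


def lowercase_with_list(values):
    # Uses return, but only after all values have been processed.
    return [v.lower() for v in values]


cities = ["Berlin", "PARIS", "Rome"]  # hypothetical sample data

print(list(lowercase_with_yield(cities)))  # ['berlin', 'paris', 'rome']
print(lowercase_with_return(cities))       # berlin
print(lowercase_with_list(cities))         # ['berlin', 'paris', 'rome']
```

Note that lowercase_with_return returns None for an empty input, because the loop body never runs.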
Answered By - user12541086