Issue
I know that using global variables is not a good idea, and I plan to do something different. But while playing around, I ran into a strange global variable issue within Scrapy that I don't see in pure Python.
When I run this bot code:
import scrapy
from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["lib-web.org"]
    start_urls = [
        "http://www.lib-web.org/united-states/public-libraries/michigan/"
    ]
    count = 0

    def parse(self, response):
        for sel in response.xpath('//div/div/div/ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('p/text()').extract()
            global count;
            count += 1
            print count
            yield item
DmozItem:
import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
I get this error:
  File "/Users/Admin/scpy_projs/tutorial/tutorial/spiders/dmoz_spider.py", line 22, in parse
    count += 1
NameError: global name 'count' is not defined
But if I simply change 'count += 1' to just 'count = 1', it runs fine.
What's going on here? Why can I not increment the variable?
Again, if I run similar code outside of a Scrapy context, in pure Python, it runs fine. Here's the code:
count = 0

def doIt():
    global count
    for i in range(0, 10):
        count +=1

doIt()
doIt()
print count
Resulting in:
Admin$ python count_test.py
20
Solution
count is a class variable in your example, so you should access it using self.count. It solves the error, but maybe what you really need is an instance variable, because as a class variable, count is shared between all the instances of the class.
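To make the class-versus-instance distinction concrete, here is a minimal sketch in plain Python with no Scrapy involved. The class PageCounter and its attributes shared and own are hypothetical names chosen for illustration; they are not part of the original spider.

```python
class PageCounter(object):
    shared = 0  # class attribute: one value for the whole class

    def __init__(self):
        self.own = 0  # instance attribute: one value per object

    def visit(self):
        self.own += 1            # updates only this instance
        PageCounter.shared += 1  # updates the value every instance sees


a = PageCounter()
b = PageCounter()
a.visit()
a.visit()
b.visit()

print("%d %d %d" % (a.own, b.own, PageCounter.shared))  # 2 1 3
```

One thing to keep in mind if you apply this to the spider: self.count += 1 reads the class attribute the first time and then stores the result on the instance, so from then on that instance carries its own count.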
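Assigning count = 1 inside the parse method works because it never reads an existing variable; it simply creates a new one. With the global count statement still in place, that new variable is a module-level global (without the statement it would be a local variable). Either way, it is a different variable from the class variable count.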
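The underlying rule is that names defined in a class body are not visible as bare names inside the class's methods. The sketch below, using a hypothetical class Demo, shows the bare lookup failing, then succeeding once a separate module-level count exists, while the class attribute stays untouched.

```python
class Demo(object):
    count = 0  # class attribute, reachable as Demo.count or self.count

    def read_bare_name(self):
        # A bare `count` is looked up in local, enclosing-function,
        # global and builtin scopes. The class body is skipped.
        try:
            return count
        except NameError:
            return "NameError"

    def assign_global(self):
        global count
        count = 1  # creates a brand-new module-level variable


d = Demo()
before = d.read_bare_name()  # no global `count` exists yet
d.assign_global()
after = d.read_bare_name()   # now finds the module-level `count`
```

After this runs, Demo.count is still 0, which is why the assignment appears to "work" without ever touching the counter you meant to update.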
Your pure Python example works because you did not define a class, but a function instead, and the variable count you created there has global scope, which is accessible from the function scope.
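For comparison, here is a small sketch of the module-level case, with and without the global statement. The function names are hypothetical; the point is that the global statement is what lets the augmented assignment reach the module-level variable.

```python
count = 0  # module-level (global) variable


def increment_with_global():
    global count  # bind the name to the module-level variable
    count += 1


def increment_without_global():
    # The assignment makes `count` local to this function,
    # so reading it before it has a value fails.
    try:
        count += 1
    except UnboundLocalError:
        return "UnboundLocalError"
    return "ok"


increment_with_global()
increment_with_global()
result = increment_without_global()
```

Here count ends up at 2 and result holds "UnboundLocalError", since the second function never touches the global.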
Answered By - Valdir Stumm Junior