Issue
Here is my pipelines.py (Python 3 + Scrapy 1.4).
import urllib.request

class MoviePipeline(object):
    def process_item(self, item, spider):
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0'}
        req = urllib.request.Request(url=item['addr'], headers=headers)
        res = urllib.request.urlopen(req)
        file_name = '/tmp/' + item['name'] + '.jpg'
        print(file_name)
        with open(file_name, 'wb') as fp:
            fp.write(res.read())
1. print(file_name) prints nothing.
print(item['name']) prints the item's name correctly in the parse function of my movie.py.
Why does print produce no output in pipelines.py when I run my spider with scrapy crawl movie?
2. Why is no jpg file saved in the /tmp directory?
import urllib.request

addr = 'selected_from_crawled_url'
req = urllib.request.Request(url=addr)
res = urllib.request.urlopen(req)
file_name = "/tmp/test.jpg"
with open(file_name, 'wb') as fp:
    fp.write(res.read())
I verified that the snippet above works on its own. Why does the same logic fail inside the pipeline?
Solution
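The pipeline was never enabled, so Scrapy never called process_item; that is why nothing was printed and no file was saved. Enable it in the project settings:
vim movie/settings.py
ITEM_PIPELINES = {
    'movie.pipelines.MoviePipeline': 100,
}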
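Once the pipeline is listed in ITEM_PIPELINES, Scrapy calls process_item for every item the spider yields. Below is a minimal sketch of how the pipeline could look with two small changes: it returns the item so that any later pipeline still receives it, and it logs through the logging module instead of print. The output_dir parameter and the logger name are assumptions added for illustration; the original code hard-codes '/tmp/'.

```python
import logging
import os
import urllib.request

logger = logging.getLogger(__name__)


class MoviePipeline(object):
    def __init__(self, output_dir='/tmp'):
        # output_dir is a hypothetical parameter; the original hard-codes '/tmp/'.
        self.output_dir = output_dir

    def process_item(self, item, spider):
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0'}
        req = urllib.request.Request(url=item['addr'], headers=headers)
        file_name = os.path.join(self.output_dir, item['name'] + '.jpg')
        # Download the image and write the raw bytes to disk.
        with urllib.request.urlopen(req) as res, open(file_name, 'wb') as fp:
            fp.write(res.read())
        logger.info('saved %s', file_name)
        # Return the item so later pipelines (if any) still receive it.
        return item
```

A quick way to confirm the setting took effect is to look at the startup log of scrapy crawl movie, which lists the enabled item pipelines.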
Answered By - showkey