Issue
I want to use a single Scrapy project for multiple scrapers, i.e. multiple spiders, where scraper1 runs with a command like `scrapy crawl scar`. Here is my folder structure:
project
│
└───Spider
│   │---scraper1.py
│   │---scraper2.py
│---items.py
│---pipelines.py
│---settings.py
# scraper1.py
import scrapy

from ..items import NepalLiveShareItem  # item class defined in items.py


class FloorSheetSpider(scrapy.Spider):
    name = "nepse"
    allowed_domains = ['nl.indeed.com']
    # start_urls = ['https://merolagani.com/CompanyDetail.aspx?symbol=AKPL']

    def start_requests(self):
        pass

    def parse(self, response):
        # other usecases
        items = NepalLiveShareItem()  # this is items 1
        yield items
This works fine with a single spider, but when I add another item class in scraper2.py, the item class from scraper1.py also runs through the pipeline. Any reason for this weird behaviour? P.S. I have used a separate pipeline for each spider and registered both in settings too.
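My registration in settings.py looks roughly like this (the pipeline class names below are simplified placeholders, not the actual ones from the project):

# settings.py (placeholder pipeline names) -- anything registered
# here applies project-wide, i.e. to every spider
ITEM_PIPELINES = {
    'project.pipelines.Scraper1Pipeline': 300,
    'project.pipelines.Scraper2Pipeline': 400,
}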
Solution
Pipelines that are specified in the settings.py file run for every spider that exists in the project.
If you want to run a pipeline for one specific spider only, you need to specify it in the spider file via `custom_settings`.
For example:
import scrapy

from ..items import NepalLiveShareItem  # item class defined in items.py


class FloorSheetSpider(scrapy.Spider):
    name = "nepse"
    allowed_domains = ['nl.indeed.com']
    start_urls = ['https://merolagani.com/CompanyDetail.aspx?symbol=AKPL']

    # ITEM_PIPELINES set here applies only to this spider.
    # The key is the dotted path to the pipeline class,
    # e.g. a class defined in your project's pipelines.py.
    custom_settings = {
        'ITEM_PIPELINES': {
            'your_project.pipelines.YourPipeline': 400,
        }
    }

    def parse(self, response):
        # other usecases
        items = NepalLiveShareItem()  # this is items 1
        yield items
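Note that the `ITEM_PIPELINES` in `custom_settings` overrides the project-wide one for that spider. If you would rather keep the registration in settings.py for all spiders, another common pattern is to check the spider's name inside the pipeline itself; a minimal sketch, assuming a hypothetical pipeline class name:

# pipelines.py -- gate a globally registered pipeline by spider
# name (the class name here is a hypothetical placeholder)
class NepalLiveSharePipeline:
    def process_item(self, item, spider):
        if spider.name != 'nepse':
            # Not our spider: pass the item along unprocessed so the
            # next pipeline (if any) still receives it.
            return item
        # ... handle items from the "nepse" spider here ...
        return item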
Answered By - Nazmus Sakib