Issue
Let's say I have this spider:
    import scrapy

    class ExampleSpider(scrapy.Spider):
        name = 'ExampleSpider'
        start_urls = []

        def parse(self, response):
            for res in response.css('div.example'):
                item = {
                    'example': res.css('examplehere').get()
                }
                yield item
Is there a way to set start_urls = ["examplesite.com/{}/search"] and then loop through a text file of words, formatting each word into the URL? For example, something like start_urls = ["examplesite.com/{}/search".format(i for i in txtfile.txt)], so that the spider scrapes the URL for every word in the text file. I'm not sure whether this can be done in Scrapy; please let me know the best way.
Solution
This question was asked before.
Use the start_requests method:
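    import scrapy

    class ExampleSpider(scrapy.Spider):
        name = 'ExampleSpider'

        def start_requests(self):
            # Read one search word per line and build a URL for each.
            with open('spiders/urlFile.txt', 'r') as f:
                for line in f:
                    url = f"https://examplesite.com/{line.rstrip()}/search"
                    # The request must be yielded, otherwise it is never scheduled.
                    # With no callback given, the response goes to parse().
                    yield scrapy.Request(url=url)

        def parse(self, response):
            for res in response.css('div.example'):
                item = {
                    'example': res.css('examplehere').get()
                }
                yield item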
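The URL-building step can also be pulled out into a small helper so it can be checked without running a crawl. The following is only a sketch under a few assumptions: the name build_search_urls and the URL_TEMPLATE constant are hypothetical, blank lines in the file are skipped, and each word is percent-encoded, which the original code does not do.

```python
from urllib.parse import quote

# URL pattern taken from the question; the constant name is an assumption.
URL_TEMPLATE = "https://examplesite.com/{}/search"


def build_search_urls(lines, template=URL_TEMPLATE):
    """Turn an iterable of raw lines (one word per line) into search URLs."""
    urls = []
    for line in lines:
        word = line.strip()
        if not word:
            # Skip blank lines so no malformed URL is produced.
            continue
        # Percent-encode the word so spaces and slashes are safe in the path.
        urls.append(template.format(quote(word, safe="")))
    return urls


if __name__ == "__main__":
    sample_lines = ["apple\n", "\n", "ice cream\n"]
    for url in build_search_urls(sample_lines):
        print(url)
```

Inside start_requests, the file object can be passed straight to the helper and a scrapy.Request yielded for each returned URL. Because Scrapy's default start_requests iterates over start_urls, assigning the returned list to start_urls should also work, provided the file path resolves from the directory the crawl is started in.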
Answered By - SuperUser