Issue
I do not understand why my spider won't run. I tested the CSS selector separately, so I do not think the problem is in the parsing method.
Traceback message: ReactorNotRestartable:
class espn_spider(scrapy.Spider):
    name = "fsu2021_spider"

    def start_requests(self):
        urls = "https://www.espn.com/college-football/team/_/id/52"
        for url in urls:
            yield scrapy.Request(url = url, callback = self.parse_front)

    def parse(self, response):
        schedule_link = response.css('div.global-nav-container li > a::attr(href)')

process = CrawlerProcess()
process.crawl(espn_spider)
process.start()
Solution
urls = "https://www.espn.com/college-football/team/_/id/52"
for url in urls:
Because "urls" is a string, the loop iterates over its characters one at a time. Change it to a list:
urls = ["https://www.espn.com/college-football/team/_/id/52"]
...
...
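To see the difference outside of Scrapy, here is a minimal plain-Python sketch. The names url_string, url_list, from_string and from_list are made up for illustration; only the URL comes from the question.

```python
# A string is iterable, so a for loop walks it one character at a time.
url_string = "https://www.espn.com/college-football/team/_/id/52"
from_string = [url for url in url_string]

# Wrapping the same URL in a list makes the loop yield the whole URL once.
url_list = ["https://www.espn.com/college-football/team/_/id/52"]
from_list = [url for url in url_list]

print(from_string[:5])  # ['h', 't', 't', 'p', 's']
print(from_list)        # ['https://www.espn.com/college-football/team/_/id/52']
```

With the string version, every single character would be passed to scrapy.Request as if it were a URL.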
Also, you don't have a "parse_front" function. If you just left it out of the snippet, ignore this; if it was a mistake, change the callback to:
yield scrapy.Request(url=url, callback=self.parse)
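The missing callback can be illustrated without Scrapy. In this rough sketch, BrokenSketch and FixedSketch are hypothetical stand-in classes (they do not subclass scrapy.Spider), and a plain tuple stands in for scrapy.Request. It only shows that looking up a method that was never defined fails as soon as the generator runs.

```python
class BrokenSketch:
    """Hypothetical stand-in for the spider; it does not use Scrapy."""

    def start_requests(self):
        urls = ["https://www.espn.com/college-football/team/_/id/52"]
        for url in urls:
            # self.parse_front was never defined, so this lookup raises.
            yield (url, self.parse_front)

    def parse(self, response):
        return response


class FixedSketch(BrokenSketch):
    def start_requests(self):
        urls = ["https://www.espn.com/college-football/team/_/id/52"]
        for url in urls:
            # self.parse exists, so the callback resolves normally.
            yield (url, self.parse)


try:
    list(BrokenSketch().start_requests())
    error = None
except AttributeError as exc:
    error = exc

fixed_requests = list(FixedSketch().start_requests())

print(type(error).__name__)           # AttributeError
print(fixed_requests[0][1].__name__)  # parse
```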
Answered By - SuperUser