Issue
import scrapy
from scrapy.loader import ItemLoader

class Summoner(scrapy.Item):
    summ = scrapy.Field()
    rank = scrapy.Field()

class myspider(scrapy.Spider):
    name = 'spider1'
    start_urls = ['https://www.op.gg/ranking/ladder/']

    def parse(self, response):
        sel = scrapy.Selector(response)
        summoners = sel.xpath('//ul[@class="ranking-highest__list"]/ul')
        for ran, summon in enumerate(summoners):
            item = ItemLoader(Summoner(), summon)
            item.add_xpath('summ', './/li/a/text()')
            item.add_value('rank', ran)
            yield item.load_item()
My intention is to get every summoner's name from the leaderboard along with its rank (through enumerate).
I then run scrapy runspider myscript.py -o results.xml
and the spider finishes, leaving a 0-byte .xml file.
No errors are shown.
I have tried changing the XPath for summoners multiple times without any success.
Also, an additional question: am I supposed to work out the XPath myself, as I attempted above, or should I just copy it from Inspect Element? Doing so, I get something like /html/body/div[2]/div[3]/div[3]/div/div/div/div[1]/ul
(which still doesn't work, by the way).
I'm sure my problem lies in the XPath; could you correct me?
Solution
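Your XPaths were wrong. The outer expression looks for ul elements directly inside the ranking list, but the entries are li elements, so it matches nothing, the loop never runs, and no items are written. Try the following instead, which selects the li entries and reads the rank from the page rather than from enumerate: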
import scrapy
from scrapy.loader import ItemLoader

class Summoner(scrapy.Item):
    summ = scrapy.Field()
    rank = scrapy.Field()

class myspider(scrapy.Spider):
    name = 'spider1'
    start_urls = ['https://www.op.gg/ranking/ladder/']

    def parse(self, response):
        for summon in response.xpath('//ul[@class="ranking-highest__list"]/li[contains(@id,"summoner-")]'):
            item = ItemLoader(Summoner(), summon)
            item.add_xpath('summ', './/a[@class="ranking-highest__name"]/text()')
            item.add_xpath('rank', './/*[@class="ranking-highest__rank"]/text()')
            yield item.load_item()
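
To see why the original selector produced an empty file, it can help to try both expressions against a small piece of markup. The sketch below is only an illustration: the HTML is a made-up, simplified stand-in for the page (the real markup is larger and may differ), and it uses the standard library's xml.etree.ElementTree, which supports a subset of XPath, rather than Scrapy's own selectors, so that it runs without Scrapy installed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: class names are taken from the answer's
# XPaths, everything else (names, ranks, nesting) is invented for illustration.
SAMPLE = """
<div>
  <ul class="ranking-highest__list">
    <li id="summoner-1">
      <div class="ranking-highest__rank">1</div>
      <a class="ranking-highest__name">PlayerOne</a>
    </li>
    <li id="summoner-2">
      <div class="ranking-highest__rank">2</div>
      <a class="ranking-highest__name">PlayerTwo</a>
    </li>
  </ul>
</div>
""".strip()

root = ET.fromstring(SAMPLE)

# Shape of the question's selector: <ul> children of the list. There are none.
old_matches = root.findall(".//ul[@class='ranking-highest__list']/ul")

# Shape of the answer's selector: the entries are <li> children of the list.
new_matches = root.findall(".//ul[@class='ranking-highest__list']/li")

rows = []
for li in new_matches:
    # Relative lookups inside each entry, mirroring the add_xpath calls.
    name = li.find(".//a[@class='ranking-highest__name']").text
    rank = li.find(".//*[@class='ranking-highest__rank']").text
    rows.append((rank, name))

print(len(old_matches))  # number of entries matched by the question's selector
print(rows)              # (rank, name) pairs matched by the corrected selector
```

With an empty match list the for loop in parse never executes, which is consistent with a run that reports no errors and writes no items.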
Answered By - SMTH