Issue
I'm trying to scrape the following website, https://www.trollandtoad.com/magic-the-gathering/aether-revolt/10066, and it scrapes almost all the data perfectly. However, when a card has many sellers and there is a "View More" button, it does not get all the prices for the different sellers, even though all the needed data is in the HTML whether I click "View More" or not. For example, in the pictures below you can see a card before and after the "View More" button is clicked. It scrapes 7 of the 8 listings; the only one it does not scrape is the Evo Merchant listing for 7.99, the one that appears right after I click "View More". The two below it, Paradise Games for 2.98 and Evo Merchant for 6.99, get scraped just fine, so I do not know what is going on.
def parse(self, response):
    for game in response.css('div.card > div.row'):
        item = GameItem()
        item["Card_Name"] = game.css("a.card-text::text").get()
        for buying_option in game.css('div.buying-options-table div.row:not(:first-child)'):
            item["Condition"] = buying_option.css("div.col-3.text-center.p-1::text").get()
            item["Price"] = buying_option.css("div.col-2.text-center.p-1::text").get()
            yield item
Solution
I think your problem lies in your CSS selector, specifically the :not(:first-child) part.
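I haven't looked into the HTML carefully, but apparently the first item after the "View More" link is also the first child of its own parent element, so :not(:first-child) excludes it along with the table header. I would remove the table header some other way, for example by slicing off the first match: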
for buying_option in game.css('div.buying-options-table div.row')[1:]:
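To see why the two approaches differ, here is a minimal sketch that uses only the standard library instead of Scrapy. The markup is hypothetical: it assumes the rows revealed by "View More" are wrapped in their own container div (the class name "more" and the seller names are made up), which may not match the real page exactly.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: rows revealed by "View More" sit inside their own wrapper div.
HTML = """
<div class="buying-options-table">
  <div class="row">header</div>
  <div class="row">seller A</div>
  <div class="row">seller B</div>
  <div class="more">
    <div class="row">seller C</div>
    <div class="row">seller D</div>
  </div>
</div>
"""

root = ET.fromstring(HTML)

def is_row(el):
    return el.tag == "div" and el.get("class") == "row"

# All rows in document order, like game.css('div.buying-options-table div.row').
all_rows = [el for el in root.iter("div") if is_row(el)]

# Every element that is the first child of its parent; this is what :first-child matches.
first_children = {id(parent[0]) for parent in root.iter() if len(parent)}

# Equivalent of 'div.row:not(:first-child)': drops the header AND "seller C".
not_first_child = [el.text for el in all_rows if id(el) not in first_children]

# Equivalent of slicing with [1:]: drops only the first match, i.e. the header.
sliced = [el.text for el in all_rows[1:]]

print(not_first_child)  # ['seller A', 'seller B', 'seller D']
print(sliced)           # ['seller A', 'seller B', 'seller C', 'seller D']
```

Under that assumption, the selector version loses the first row inside the wrapper, which matches the symptom in the question: the listing right after "View More" goes missing while the ones below it are scraped.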
Answered By - Imperishable Night