Issue
Basically, my goal is to scrape each product item page, but I think my code is wrong and I don't know which other methods to use.
import scrapy

class AdamdentalSpider(scrapy.Spider):
    name = "adamdental"
    start_urls = ["https://www.adamdental.com.au/search?ProductSearch=%25"]

    def parse(self, response):
        products = response.css("div[data-role=product]")
        for product in products:
            title_item = products.css("span.widget-productlist-title a")[0]
            url = title_item.attrib['href']
            yield scrapy.Request(
                url=self.start_urls[0] + url,
                callback=self.parse_details
            )

    def parse_details(self, response):
        main = response.css("div.product-detail-right")
        yield {
            "title": main.css("h1.widget-product-title::text"),
            "sku": main.css("h4.subtitle::text"),
            "price": main.css("span.item-price"),
            "description": main.css("div.widget-product-field.info-group.widget-product-field-ProductDescription.description-gap"),
        }
Solution
Expecting a single request to yield two responses is not how Scrapy pulls data. There are two concrete bugs in the posted code: inside the loop, products.css(...) selects from the whole result list instead of the current product, so the same first link is taken every time, and self.start_urls[0] + url appends each relative href onto the search URL, producing invalid links. (The parse_details callback also yields raw selectors instead of extracted text.) The correct pattern is to extract each product link in parse, build an absolute URL, and yield a new request with parse_details as its callback:
import scrapy

class AdamdentalSpider(scrapy.Spider):
    name = "adamdental"
    start_urls = ["https://www.adamdental.com.au/search?ProductSearch=%25"]

    def parse(self, response):
        # Each product title on the results page wraps a link to its detail page
        for link in response.css('span.widget-productlist-title'):
            rel_url = link.css('a::attr(href)').get()
            abs_url = f'https://www.adamdental.com.au{rel_url}'
            # Request the product page and parse it in a separate callback
            yield scrapy.Request(
                url=abs_url,
                callback=self.parse_details
            )

    def parse_details(self, response):
        yield {
            "title": response.css("h1.widget-product-title::text").get(),
            "sku": response.css("h4.subtitle::text").get(),
            "price": response.css("span.item-price::text").get(),
            # Join every text node of the description block, then strip the
            # Windows line breaks and surrounding whitespace
            "description": ''.join(response.xpath('//*[@class="info-group-content"]//text()').getall()).replace('\r\n', '').strip(),
        }
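As an aside, Scrapy can resolve relative links for you: response.follow() joins an href (or the <a> selector itself) against the current page URL, so the domain doesn't have to be hard-coded. A minimal sketch of the same spider using that helper, with the selectors unchanged from the answer above:

import scrapy

class AdamdentalSpider(scrapy.Spider):
    name = "adamdental"
    start_urls = ["https://www.adamdental.com.au/search?ProductSearch=%25"]

    def parse(self, response):
        # response.follow() accepts the <a> selector directly and resolves
        # its relative href against response.url
        for link in response.css('span.widget-productlist-title a'):
            yield response.follow(link, callback=self.parse_details)

    def parse_details(self, response):
        yield {
            "title": response.css("h1.widget-product-title::text").get(),
            "sku": response.css("h4.subtitle::text").get(),
            "price": response.css("span.item-price::text").get(),
        }

To reproduce the output below, save the spider to a file (adamdental.py here is just an example name) and run it with scrapy runspider adamdental.py; adding -O products.json writes the items to a file (the -O overwrite flag needs Scrapy 2.0+).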
Output:
{'title': 'Disposable Premium Air Water Triplex Syringe Tips 150/pk', 'sku': ' 103100W', 'price': '$31.00', 'description': '150/packMetal interior, plastic exteriorInterchangeable with most metal tips with no conversionDesign for snug locking fit'}
2022-05-26 18:29:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.adamdental.com.au/anthogyr-torq-control-universal-torque-wrench> (referer: https://www.adamdental.com.au/search?ProductSearch=%25)
2022-05-26 18:29:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.adamdental.com.au/anthogyr-torq-control-universal-torque-wrench>
{'title': 'Anthogyr Torq Control Universal Torque Wrench', 'sku': ' 15501', 'price': '$1210.00', 'description': 'Anthogyr products are special order items and therefore cannot be refunded, only exchanged for other Anthogyr products.Universal Torque Wrench Torq ControlThe success of the implant treatment\xa0depends on\xa0the precise tightening\xa0of the parts placed directly on the implant. A pre-stressed tightening of the screw will help avoid any risk of screw loosening. Also, high tightening torques may lead to screw fracture.A calibrated tightening can only be guaranteed through the use of a precision instrument offering a torque control system.The dynamometrical manual wrench *Torq Control®\xa0has been specially designed to meet those requirements.Universal torque wrench, recommended with any type of implantsAutomatic declutching
for optimum securityOptimized access in mouth thanks to the micro-head100° angulated micro-head for easy access in mouth (posterior areas)Perfect control of torque thanks to 7 torques values (10/15/20/25/30/32/35N.cm)Only 135 gr for a better freedom of movementOne piece design with smooth surface to limit bacterial retention'}
2022-05-26 18:29:55 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.adamdental.com.au/infection-control/protective-eyewear/face-shields-and-visors/eye-shield-refills-12pk> (referer: https://www.adamdental.com.au/search?ProductSearch=%25)
2022-05-26 18:29:55 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.adamdental.com.au/infection-control/protective-eyewear/face-shields-and-visors/eye-shield-refills-12pk>
{'title': 'Eye Shield Refills 12pk', 'sku': ' 18110', 'price': '$16.50', 'description': '12 Disposable Eye Shields'}
2022-05-26 18:29:56 [scrapy.core.engine] DEBUG: Crawled (200)
...and so on.
Answered By - F.Hoque