Issue
I'm trying to run my spider from a script instead of using the "scrapy crawl..." command. I'm following this documentation: https://docs.scrapy.org/en/latest/topics/practices.html#run-scrapy-from-a-script, but my code is not working. I would appreciate the help!
import scrapy
from scrapy.crawler import CrawlerProcess


class misbeneficiosSpider(scrapy.Spider):
    name = 'misbeneficios'
    start_urls = ['https://productos.misbeneficios.com.uy/tv-y-audio',
                  'https://productos.misbeneficios.com.uy/tv-y-audio?p=2']

    def parse(self, response):
        for products in response.css('div.product-item-info'):
            yield {
                'name': products.css('a.product-item-link::text').get(),
                'price': products.css('span.price::text').get().replace('U$S\xa0', '')#[:-3].upper()
            }
        next_page = response.css('a.action.next').attrib['href']
        if next_page is not None:
            yield response.follow(next_page, callback=self.parse)


process = CrawlerProcess(settings={
    "FEEDS": {
        "items.csv": {"format": "csv"},
    },
})
process.crawl(misbeneficiosSpider)
process.start()
This is the error output I'm seeing:
2022-11-09 00:02:08 [scrapy.utils.log] INFO: Scrapy 2.7.1 started (bot: scrapybot)
2022-11-09 00:02:08 [scrapy.utils.log] INFO: Versions: lxml 4.9.1.0, libxml2 2.9.12, cssselect 1.2.0, parsel 1.7.0, w3lib 2.0.1, Twisted 22.10.0, Python 3.8.5 (tags/v3.8.5:580fbb0, Jul 20 2020, 15:43:08) [MSC v.1926 32 bit (Intel)], pyOpenSSL 22.1.0 (OpenSSL 3.0.7 1 Nov 2022), cryptography 38.0.3, Platform Windows-10-10.0.22000-SP0
2022-11-09 00:02:08 [scrapy.crawler] INFO: Overridden settings:
{}
2022-11-09 00:02:08 [py.warnings] WARNING: C:\Users\cabre\AppData\Local\Programs\Python\Python38-32\lib\site-packages\scrapy\utils\request.py:231: ScrapyDeprecationWarning: '2.6' is a deprecated value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting.
It is also the default value. In other words, it is normal to get this warning if you have not defined a value for the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting. This is so for backward compatibility reasons, but it will change in a future version of Scrapy.
See the documentation of the 'REQUEST_FINGERPRINTER_IMPLEMENTATION' setting for information on how to handle this deprecation.
return cls(crawler)
2022-11-09 00:02:08 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.selectreactor.SelectReactor
2022-11-09 00:02:08 [scrapy.extensions.telnet] INFO: Telnet Password: 034a39dcf0704cf3
2022-11-09 00:02:08 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.feedexport.FeedExporter',
'scrapy.extensions.logstats.LogStats']
2022-11-09 00:02:09 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
There are other errors as well that I can't post.
Solution
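It looks like your code would work fine with some minor error checking: the version below skips product cards with a missing name or price, and only follows the next-page link when one is found.
By the way, there are two span.price tags per product card, and I wasn't sure which one you wanted, so I selected the first one.
For example: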
import scrapy
from scrapy.crawler import CrawlerProcess


class misbeneficiosSpider(scrapy.Spider):
    name = 'misbeneficios'
    start_urls = ['https://productos.misbeneficios.com.uy/tv-y-audio']

    def parse(self, response):
        for products in response.css('div.product-item-info'):
            # .get() returns the first match, or None if nothing matches
            price = products.css('span.price-wrapper span.price::text').get()
            name = products.css('a.product-item-link::text').get()
            # Only yield items that have both a name and a price
            if price and name:
                yield {'name': name.strip(),
                       'price': price.strip('U$S\xa0').strip()}
        # Follow the next-page link only if one was found
        next_page = response.css('a.next')
        if next_page:
            yield response.follow(next_page.attrib['href'], callback=self.parse)


process = CrawlerProcess(
    settings={"FEEDS": {"items.csv": {"format": "csv"}}}
)
process.crawl(misbeneficiosSpider)
process.start()
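The None check on the price can also be kept in a small helper that runs without Scrapy. This is only a sketch: clean_price is a hypothetical name, not part of Scrapy or of the code above, and it swaps strip('U$S\xa0') for replace() followed by strip() so that leading whitespace before the currency prefix does not get in the way.

```python
from typing import Optional


def clean_price(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper: drop the 'U$S' prefix from a scraped price string.

    Returns None when the selector found nothing (raw is None), so the
    caller can skip the item instead of crashing on None.replace(...).
    """
    if raw is None:
        return None
    # Remove the currency prefix, then trim surrounding whitespace.
    # str.strip() with no arguments also removes the non-breaking space \xa0.
    return raw.replace('U$S', '').strip()


if __name__ == "__main__":
    print(clean_price('U$S\xa0849,00'))
    print(clean_price(None))
```

Inside parse, the item would then be yielded only when clean_price returns a value.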
Output (items.csv):
name,price
"Smart TV LG 65"" UHD AI 65UP7750PSB","849,00"
"Smart TV Samsung 85"" Neo QLED UHD 4K","4.599,00"
"Smart TV LG 55"" OLED OLED55C2PSA","2.539,00"
"Smart TV LED Samsung 75"" UHD 4K UN75BU8000","1.999,00"
"Smart TV LG 32"" HD AI 32LM637BPSB","259,00"
"Smart TV TCL QLED 4K 55"" 55C825","1.599,00"
Smart TV Samsung Frame 55” UHD 4K SAQN55LS03BA,"1.679,00"
"Smart TV LED Smartlife 43"" SL-TV43FHDNXK","325,00"
"Smart TV Hometech 50"" UHD","339,00"
"Smart TV Hometech 55"" UHD","279,00"
"Smart TV Hometech 40"" FHD","275,00"
Barra de Sonido LG SP8A,"869,00"
"Smart TV LED 39"" HD Smartlife SL-TV39HDNX","265,00"
Smart TV LG OLED 48'' OLED48A1PSA,"1.359,00"
"Smart TV LG UHD 4K 60"" 60UQ8050PSB Al","989,00"
"Smart TV LG UHD 4K 65"" 65UQ8050PSB Al","1.275,00"
"Smart TV LG OLED 4K 65"" OLED65C2PSA AI","4.829,00"
"Smart TV LG 48"" OLED OLED48C2PSA","2.049,00"
"Smart TV LG 43"" NANOCELL 43NANO75SQA","549,00"
Proyector Samsung The Freestyle,"1.499,00"
"Smart TV Sony KD-75X80J 75""","1.999,00"
Parlante inalámbrico con Bluetooth Sony SRS-XB13,"59,00"
"Smart TV LG UHD 50"" 50UP7500PSF","539,00"
"Smart TV Philips 58"" 4K 58PUD6654/55 Borderless","899,00"
"Smart TV LG 43"" UHD 43UP7500PSF","465,00"
"Smart TV LG 86"" 4K 86UN8000PSB","3.990,00"
Parlante portátil Sony SRS-XB33,"159,00"
"Smart TV LG 75"" UHD AI 75UP7750PSB","2.095,00"
Equipo de audio de alta potencia V73D Sony MHC-V73D,"949,00"
Parlante profesional Xion Activo,"470,00"
Torre de sonido Sony MHC-V13,"359,00"
Barra de sonido Xion XI-BAR40,"55,00"
"Smart TV TCL 55"" UHD 55P615","579,00"
"Smart TV LG 65"" 4K OLED AI OLED65CXPSA","4.199,00"
"Smart TV TCL 50"" UHD 50P615","529,00"
"Smart TV Samsung 43"" FHD UN43T5300","359,00"
"Smart TV Philips 65"" 4K 65PUD6794/55 Ambilight","1.235,00"
"Smart TV LG OLED 77"" UHD AI OLED77G1","9.790,00"
"Smart TV Philips 32"" HD 32PHD6825/55","211,00"
"Smart TV LG 75"" 8K AI 75NANO95SNA","6.639,00"
"Smart TV Smartlife 50"" LED 4K","465,00"
"Smart TV Sony 55"" 4K KD-55X80J","999,00"
"Smart TV TCL 40"" FHD 40S65A","315,00"
Barra de sonido Sony HT-S100,"185,00"
Barra de Sonido LG SK1D,"189,00"
Parlante Sony SRS-XB23 Negro,"115,00"
Auriculares Apple Airpods 2 con estuche de carga,"166,00"
Auriculares Sony WH-CH510 Negro,"39,00"
Auriculares Sony WH-CH510 Blanco,"39,00"
Auriculares Sony WH-CH510 Azul,"39,00"
Torre de Sonido LG RN9 Xboom,"785,00"
"Smart TV LG 77"" 4K OLED AI OLED77GXPSA","9.190,00"
Torre de sonido LG RN7 Xboom,"559,00"
Torre de sonido LG RN5 Xboom,"345,00"
Parlante Samsung Giga Audio Party MX-T40,"325,00"
Parlante Energy Sistem Fabric Box 1+ grape,"27,00"
Parlante Energy Sistem Fabric Box 1+ blueberry,"25,00"
Parlante Sony MHC-V02,"279,00"
Parlante Billboard Bb730bt,"13,00"
Parlante Sony SRS-XB12 BT,"59,00"
Minicomponente LG Xboom CK99,"1.189,00"
Minicomponente LG Xboom CJ45,"299,00"
Minicomponente LG Xboom CM4360,"199,00"
Auricular SONY MDR-ZX110 Blanco,"22,00"
Auriculares Sony MDR-AS210 Blanco,"18,00"
Barra de sonido Xion XI-BAR70,"115,00"
Auriculares Sony Mdr-E9LP,"9,00"
Parlante Xion profesional XI-SD8BAT,"60,00"
Auricuares Sony MDR-ZX110 negro,"20,00"
Minicomponente LG XBOOM OL45,"315,00"
Auriculares Huawei Freebuds 4I Otter-Ct030,"93,00"
Minicomponente LG XBOOM CL88,"759,00"
Auriculares Sony MDR-EX15LP Rosado,"11,00"
Auriculares Sony MDR-EX15LP Azul,"11,00"
Auriculares Sony MDR-EX15LP negro,"11,00"
Auriculares Sony MDR-EX15LP Blanco,"11,00"
Auriculares Sony MDR-EX15LP violeta,"11,00"
"Smart TV Samsung 32"" UN32T4310 HD","265,00"
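If you want to work with the exported file afterwards, a rough sketch like the following reads it back with the standard csv module. It uses two rows copied from the output above as an in-memory sample instead of opening items.csv. The price_to_decimal helper is a hypothetical name, and it assumes prices always use "." as the thousands separator and "," as the decimal separator, as they do in this output.

```python
import csv
import io
from decimal import Decimal

# Two rows copied from the output above, used as an in-memory sample.
SAMPLE = '''name,price
"Smart TV LG 65"" UHD AI 65UP7750PSB","849,00"
"Smart TV Samsung 85"" Neo QLED UHD 4K","4.599,00"
'''


def price_to_decimal(text: str) -> Decimal:
    """Hypothetical helper: convert '4.599,00' style text to a Decimal."""
    # Drop the thousands separator, then turn the decimal comma into a point.
    return Decimal(text.replace('.', '').replace(',', '.'))


# csv.DictReader unescapes the doubled quotes ("") inside quoted fields.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

if __name__ == "__main__":
    for row in rows:
        print(row["name"], price_to_decimal(row["price"]))
```

To read the real file, replace io.StringIO(SAMPLE) with open("items.csv", newline="", encoding="utf-8").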
Answered By - Alexander