Issue
I want to scrape a .NET website, so I wrote this code:
import time
import scrapy
from scrapy import Selector
from selenium import webdriver
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

class BoursakuwaitSpider(scrapy.Spider):
    name = 'boursakuwait'
    custom_settings = {
        'FEED_URI': 'second.json',
        'FEED_FORMAT': 'json',
    }
    start_urls = ['https://casierjudiciaire.justice.gov.ma/verification.aspx']

    def parse(self, no_response):
        browser = webdriver.Chrome(executable_path=ChromeDriverManager().install())
        browser.get('https://casierjudiciaire.justice.gov.ma/verification.aspx')
        time.sleep(10)
        response = Selector(text=browser.page_source)
When I put the Selenium code inside the parse function, the code does not work, but if I put it directly in the class body like this:
import time
import scrapy
from scrapy import Selector
from selenium import webdriver
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

class BoursakuwaitSpider(scrapy.Spider):
    name = 'boursakuwait'
    custom_settings = {
        'FEED_URI': 'second.json',
        'FEED_FORMAT': 'json',
    }
    start_urls = ['https://casierjudiciaire.justice.gov.ma/verification.aspx']
    browser = webdriver.Chrome(executable_path=ChromeDriverManager().install())
    browser.get('https://casierjudiciaire.justice.gov.ma/verification.aspx')
    time.sleep(10)
    response = Selector(text=browser.page_source)
The code works correctly. However, I want to use the function (the first version), and I don't know where the problem is. Any help would be appreciated.
Solution
This happens because the website of Morocco's Ministry of Justice is old enough that the SSL libraries Scrapy depends on can't handle it by default. According to this thread, you'll need to downgrade your cryptography and pyOpenSSL packages to handle the website:
pip install --upgrade cryptography==36.0.2
pip install --upgrade pyOpenSSL==22.0.0
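The downgrade matters for the first version only, and a likely reason is this: statements written directly in the class body run as soon as Python defines the class, before Scrapy sends any request, whereas parse is only called after Scrapy itself has successfully downloaded a page from start_urls. If that download fails, parse is never reached. A minimal plain-Python sketch of the difference (Demo and calls are illustrative names, not part of the spider):

```python
calls = []


class Demo:
    # Statements in the class body run once, when the class is defined.
    calls.append("class body")

    def parse(self, response):
        # This runs only when something actually calls parse().
        calls.append("parse")


# At this point only the class body has executed.
print(calls)

# Scrapy would call parse() after a successful download; here we call it by hand.
Demo().parse(None)
print(calls)
```

So the second spider opens the browser regardless of whether Scrapy's own request succeeds, which would explain why it seems to work.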
If that still doesn't work, try installing all of the following versions:
Scrapy : 2.6.1
lxml : 4.8.0.0
libxml2 : 2.9.4
cssselect : 1.1.0
parsel : 1.6.0
w3lib : 1.22.0
Twisted : 22.4.0
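To see which of these pins your environment already matches, something like the following sketch may help. It uses only the standard library (importlib.metadata, Python 3.8+). The PINS dictionary and the helper names are illustrative; libxml2 is left out because it is a C library rather than a Python distribution, and lxml is written as 4.8.0 on the assumption that this is how the 4.8.0.0 build is named on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the answer. Assumption: lxml "4.8.0.0" is published on
# PyPI as "4.8.0". libxml2 is omitted since it is a C library, not a
# Python distribution that importlib.metadata can look up.
PINS = {
    "cryptography": "36.0.2",
    "pyOpenSSL": "22.0.0",
    "Scrapy": "2.6.1",
    "lxml": "4.8.0",
    "cssselect": "1.1.0",
    "parsel": "1.6.0",
    "w3lib": "1.22.0",
    "Twisted": "22.4.0",
}


def installed_version(name):
    """Return the installed version string, or None if not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def mismatches(pins):
    """Map each package whose installed version differs to (wanted, found)."""
    report = {}
    for name, wanted in pins.items():
        found = installed_version(name)
        if found != wanted:
            report[name] = (wanted, found)
    return report


if __name__ == "__main__":
    for name, (wanted, found) in mismatches(PINS).items():
        print(f"{name}: want {wanted}, found {found or 'not installed'}")
```

Any package it lists can then be pinned with pip install name==version, in the same way as the two commands above.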
Answered By - Duc Nguyen