Issue
I am running a Selenium web scraper on an EC2 instance (Amazon Linux) as the root user. This is my initialization setup:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--disable-application-cache')
options.add_argument('--headless')
options.add_argument("--no-sand box")
options.add_argument("--window-size=1920,1080")
service = Service()
driver = webdriver.Chrome(service=service, options=options)
... scraping code ...
driver.quit()
The versions of ChromeDriver and Chrome match:
chromedriver --version
ChromeDriver 121.0.6167.85
google-chrome --version
Google Chrome 121.0.6167.85
I get the following error message:
Traceback (most recent call last):
File "/home/ec2-user/my-project/scripts/save.py", line 18, in get_and_save_new_tokens
new_tokens_raw = scrape_taptools_projects(url)
File "/home/ec2-user/my-project/scripts/taptools_scraper.py", line 99, in scrape_taptools_projects
driver = webdriver.Chrome(service=service, options=options)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
super().__init__(
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chromium/webdriver.py", line 61, in __init__
super().__init__(command_executor=executor, options=options)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 208, in __init__
self.start_session(capabilities)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 292, in start_session
response = self.execute(Command.NEW_SESSION, caps)["value"]
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: Chrome failed to start: exited normally.
(session not created: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Locally, the code works as intended. Strangely, the code also worked on the EC2 instance up until yesterday. I did not change any settings.
Can you please help me to fix this issue?
I already tried the following:
- I restarted the instance
- I reinstalled google-chrome (121.0.6167.85) on the instance and manually downloaded the matching version of ChromeDriver (121.0.6167.85)
- I rechecked if the code works locally and it did
- I changed the line
service = Service()
driver = webdriver.Chrome(service=service, options=options)
to
driver = webdriver.Chrome(options=options, service=Service(ChromeDriverManager().install()))
Both resulted in the same error message.
Update: In the meantime, I found the following suggested fix: https://bugs.chromium.org/p/chromedriver/issues/detail?id=4403#c35
Adding
options.add_argument('--remote-debugging-pipe')
results in the new error message:
Traceback (most recent call last):
File "/home/ec2-user/my-project/scripts/save.py", line 18, in get_and_save_new_tokens
new_tokens_raw = scrape_taptools_projects(url)
File "/home/ec2-user/my-project/scripts/taptools_scraper.py", line 101, in scrape_taptools_projects
driver = webdriver.Chrome(service=service, options=options)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
super().__init__(
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/chromium/webdriver.py", line 61, in __init__
super().__init__(command_executor=executor, options=options)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 208, in __init__
self.start_session(capabilities)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 292, in start_session
response = self.execute(Command.NEW_SESSION, caps)["value"]
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 347, in execute
self.error_handler.check_response(response)
File "/usr/local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 229, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.SessionNotCreatedException: Message: session not created: Chrome failed to start: exited abnormally.
(timeout: Timed out receiving message from renderer: 60,000)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Solution
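That's embarrassing: I must have accidentally let a space slip into the sandbox flag:
options.add_argument("--no-sand box")
With the space, Chrome most likely did not recognize the flag and left the sandbox enabled, and Chrome does not start as root with the sandbox on. The correct spelling is
options.add_argument("--no-sandbox")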
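A typo like this is easy to miss, because an unrecognized flag does not raise an error on the Python side. As a rough sketch (the helper name find_malformed_flags is my own and not part of Selenium), the argument list can be checked for whitespace inside a flag name before it is handed to Chrome. Only the part before "=" is inspected, since flag values such as user-agent strings may legitimately contain spaces.

```python
def find_malformed_flags(arguments):
    """Return the arguments whose flag name (the part before '=') contains whitespace."""
    suspicious = []
    for arg in arguments:
        # Split off an optional value, e.g. "--window-size=1920,1080" -> "--window-size"
        name = arg.split("=", 1)[0]
        if any(ch.isspace() for ch in name):
            suspicious.append(arg)
    return suspicious


# The argument list from the question, including the misspelled flag
chrome_args = [
    "--disable-application-cache",
    "--headless",
    "--no-sand box",
    "--window-size=1920,1080",
]

print(find_malformed_flags(chrome_args))  # prints ['--no-sand box']
```

With Selenium, the same check could presumably be run on options.arguments right before creating the driver, so that a bad flag is caught before Chrome is launched.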
Answered By - user406139