Issue
I would like to have Selenium run a headless instance of Google Chrome to mine data from certain websites without the UI overhead. I downloaded the ChromeDriver executable from here and copied it to my current scripting directory.
The driver appears to work fine with Selenium and is able to browse automatically, however I cannot seem to find the headless option. Most online examples of using Selenium with headless Chrome go something along the lines of:
import os
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.binary_location = '/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary'`
driver = webdriver.Chrome(executable_path=os.path.abspath(“chromedriver"), chrome_options=chrome_options)
driver.get("http://www.duo.com")`
However when I inspect the possible arguments for the Selenium WebDriver using the command chromedriver -h
this is what I get:
D:\Jobs\scripts>chromedriver -h
Usage: chromedriver [OPTIONS]
Options
--port=PORT port to listen on
--adb-port=PORT adb server port
--log-path=FILE write server log to file instead of stderr, increases log level to INFO
--log-level=LEVEL set log level: ALL, DEBUG, INFO, WARNING, SEVERE, OFF
--verbose log verbosely (equivalent to --log-level=ALL)
--silent log nothing (equivalent to --log-level=OFF)
--append-log append log file instead of rewriting
--replayable (experimental) log verbosely and don't truncate long strings so that the log can be replayed.
--version print the version number and exit
--url-base base URL path prefix for commands, e.g. wd/url
--whitelisted-ips comma-separated whitelist of remote IP addresses which are allowed to connect to ChromeDriver
No --headless
option is available.
Does the ChromeDriver obtained from the link above allow for headless browsing?
Solution
--headless
is not argument for chromedriver
but for Chrome
. --headless
Run chrome in headless mode, i.e., without a UI or display server dependencies. ChromeDriver is a separate executable that WebDriver uses to control Chrome and Webdriver is a a collection of language specific bindings to drive a browser.
I am able to run in headless mode with this set of options. I hope this will help:
from bs4 import BeautifulSoup, NavigableString
from selenium.webdriver.chrome.options import Options
from selenium import webdriver
import requests
import re
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-gpu')
browser = webdriver.Chrome(chrome_options=options) # see edit for recent code change.
browser.implicitly_wait(20)
Update 12 Aug 2019:
old : browser = webdriver.Chrome(chrome_options=options)
new : browser = webdriver.Chrome(options=options)
Answered By - ShivaGaire
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.