Issue
I have a list of URLs in a .txt file that I would like to open using Selenium.
Let's say the file is named b.txt and contains 2 URLs (formatted exactly as below): https://www.google.com/,https://www.bing.com/,
What I am trying to do is make Selenium open both URLs (from the .txt file). However, every time the code reaches the "driver.get" line, it fails.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

url = open('b.txt', 'r')
url_rpt = url.read().split(",")
options = Options()
options.add_argument('--headless')
options.add_argument('--disable-gpu')
driver = webdriver.Chrome(chrome_options=options)
for link in url_rpt:
    driver.get(link)
driver.quit()
The result that I get when I run the code is:
Traceback (most recent call last):
  File "C:/Users/ASUS/PycharmProjects/XXXX/Test.py", line 22, in <module>
    driver.get(link)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\ASUS\AppData\Local\Programs\Python\Python38\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
  (Session info: headless chrome=79.0.3945.117)
Any suggestion on how to re-write the code?
Solution
This error message...
Traceback (most recent call last):
.
driver.get(link)
.
self.execute(Command.GET, {'url': url})
.
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument
(Session info: chrome=79.0.3945.117)
...implies that the url passed as an argument to get() was invalid.
I was able to reproduce the same traceback when the text file containing the list of URLs has a space character after the separator that follows the last URL. Possibly a space character was present at the very end of b.txt, as in https://www.google.com/,https://www.bing.com/, (with a space after the final comma).
Debugging
A good debugging approach would be to print url_rpt, which would have revealed the space character as follows:
Code Block:
url = open('url_list.txt', 'r')
url_rpt = url.read().split(",")
print(url_rpt)
Console Output:
['https://www.google.com/', 'https://www.bing.com/', ' ']
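The same check can be tried without a browser. The snippet below is only a sketch: the three sample strings are assumptions about what b.txt might contain, and split_urls is a hypothetical helper that mirrors the split(",") call from the question. It shows that anything after the last comma, including nothing at all, becomes its own list element, and such an element is likely to be rejected by get() as an invalid argument.

```python
def split_urls(raw):
    """Split on commas exactly as the original code does, with no cleanup."""
    return raw.split(",")


# Assumed file contents; only the characters after the last comma differ.
samples = {
    "trailing comma": "https://www.google.com/,https://www.bing.com/,",
    "comma + space": "https://www.google.com/,https://www.bing.com/, ",
    "comma + newline": "https://www.google.com/,https://www.bing.com/,\n",
}

for label, raw in samples.items():
    parts = split_urls(raw)
    # repr() makes an empty string, a space, or a newline visible.
    print(label, "->", len(parts), "entries, last one is", repr(parts[-1]))
```

In all three cases the list has three entries, and the last one is not a URL.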
Solution
If you remove the space character from the end of the file, your own code would execute without errors:
from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
url = open('url_list.txt', 'r')
url_rpt = url.read().split(",")
print(url_rpt)
for link in url_rpt:
    driver.get(link)
driver.quit()
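If editing the file by hand is not convenient, the list can also be cleaned in code. This is a minimal sketch and not part of the original answer: load_urls and visit_all are hypothetical helper names, and the temporary file only stands in for url_list.txt. Each entry is stripped of surrounding whitespace, and entries that end up empty are dropped before anything is passed to get().

```python
import os
import tempfile


def load_urls(path):
    """Read a comma-separated URL file and drop blank entries."""
    with open(path, "r", encoding="utf-8") as handle:
        raw = handle.read()
    # strip() removes spaces and newlines around each entry;
    # the filter discards entries that are empty after stripping.
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def visit_all(driver, urls):
    """Call driver.get() on each URL; driver is assumed to be a Selenium WebDriver."""
    for link in urls:
        driver.get(link)


# Demonstration with a temporary file standing in for url_list.txt.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("https://www.google.com/,https://www.bing.com/, \n")
    demo_path = tmp.name

demo_urls = load_urls(demo_path)
os.remove(demo_path)
print(demo_urls)
```

With a real driver, the loop from the code above would become visit_all(driver, load_urls('url_list.txt')), followed by driver.quit().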
Answered By - undetected Selenium