Issue
I created a bunch of scrapers for this website. It has 39 pages to scrape, and I divided them into batches: 1-10, 11-20, 21-30, and 31-39.
Here's my code for reaching those pages:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.forensicnurses.org/search/custom.asp?id=2100")
# Search in the USA
select = Select(driver.find_element_by_xpath('//*[@id="txt_country"]'))
select.select_by_visible_text('United States')
search_button = driver.find_element_by_xpath('//*[@id="main"]/table/tbody/tr[4]/td[2]/input')
search_button.click()
driver.implicitly_wait(8)
driver.switch_to.frame('SearchResultsFrame')
print('getting 11 to 20')
driver.implicitly_wait(8)
tab = driver.find_element_by_xpath('//*[@id="SearchResultsGrid"]/tbody/tr[26]/td/a[10]')
tab.click()
print('21-30')
driver.implicitly_wait(8)
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="SearchResultsGrid"]/tbody/tr[26]/td/a[11]')))
element.click()
print('31-39')
driver.implicitly_wait(20)
element = WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '//*[@id="SearchResultsGrid"]/tbody/tr[26]/td/a[11]')))
element.click()
It never reaches 31-39. I'm able to make it reach 21-30 sometimes, but it usually stops at 11-20.
I get this error:
NoSuchWindowException: Message: no such window
So is it just a case of luck, or am I doing something wrong?
Solution
Instead of picking pager links by index (a[10], a[11]), click the > (next) a tag each time. Also, paging too quickly trips a 429 Too Many Requests error, so throttle with time.sleep(); note that driver.implicitly_wait() only sets a timeout for element lookups, it does not pause the script.
driver.get("https://www.forensicnurses.org/search/custom.asp?id=2100")
# Search in the USA
select = Select(driver.find_element_by_xpath('//*[@id="txt_country"]'))
select.select_by_visible_text('United States')
search_button = driver.find_element_by_xpath('//*[@id="main"]/table/tbody/tr[4]/td[2]/input')
search_button.click()
driver.implicitly_wait(5)
driver.switch_to.frame('SearchResultsFrame')
print('getting 1 to 10')
element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#SearchResultsGrid > tbody > tr.DotNetPager > td > a:nth-last-of-type(1)')))
element.click()
time.sleep(10)
print('11-20')
# repeat the wait / click / sleep block above
print('21-30')
# repeat the wait / click / sleep block above
print('31-39')
# repeat the wait / click / sleep block above
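Unlike the index-based XPaths (a[10], a[11]) in the question, which point at a different position in the pager on every page, the a:nth-last-of-type(1) selector always targets the last link in the DotNetPager row, which is the > (next) link, so the same locator works on every page.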
Another way is a loop with an extra check for the last page: once the wait for the next link fails, the loop breaks:
while True:
    try:
        element = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#SearchResultsGrid > tbody > tr.DotNetPager > td > a:nth-last-of-type(1)')))
        element.click()
        time.sleep(30)
    except Exception:
        break  # wait timed out: no next link, so this was the last page
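If you also want to collect the results while paging, here is a minimal sketch built on the same loop. It assumes, based on the XPaths above, that result rows are the non-pager tr elements of #SearchResultsGrid; the all_rows list and the :not(.DotNetPager) selector are illustrative additions, not part of the original answer:
all_rows = []
while True:
    # Grab the text of every result row on the current page,
    # skipping the DotNetPager row that holds the page links.
    for row in driver.find_elements_by_css_selector('#SearchResultsGrid > tbody > tr:not(.DotNetPager)'):
        all_rows.append(row.text)
    try:
        next_link = WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, '#SearchResultsGrid > tbody > tr.DotNetPager > td > a:nth-last-of-type(1)')))
        next_link.click()
        time.sleep(30)  # back off so the site doesn't answer with 429
    except Exception:
        break  # no clickable next link: last page reached

print(f'scraped {len(all_rows)} rows')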
Imports
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select, WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Answered By - Arundeep Chohan