Issue
I am trying to collect some data using the following code:
import time
import os
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
if __name__ == '__main__':
    print("Checking Browser driver...")
    os.environ['WDM_LOG'] = '0'
    options = Options()
    options.add_argument("start-maximized")
    options.add_experimental_option("prefs", {"profile.default_content_setting_values.notifications": 1})
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('excludeSwitches', ['enable-logging'])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--disable-blink-features=AutomationControlled')
    srv = Service()
    driver = webdriver.Chrome(service=srv, options=options)
    waitWD = WebDriverWait(driver, 10)
    baseLink = "https://residentialprotection.alberta.ca/public-registry/Property"
    print(f"Working for {baseLink}")
    driver.get(baseLink)
    waitWD.until(EC.presence_of_element_located((By.XPATH, '//input[@aria-owns="Municipality_listbox"]'))).send_keys("Calgary")
    waitWD.until(EC.presence_of_element_located((By.XPATH, '//button[@id="show-results"]'))).click()
    time.sleep(5)
    countElems = driver.find_elements(By.XPATH, '//tbody//tr[@role="row"]')
    print(len(countElems))
    for idx in range(len(countElems)):
        time.sleep(3)
        elems = driver.find_elements(By.XPATH, '//tbody//tr[@role="row"]')
        elems[idx].click()
        time.sleep(3)
        soup = BeautifulSoup(driver.page_source, 'lxml')
        worker = soup.find("label", {"for": "FileNumber"})
        wFileNumber = worker.find_next("td").text.strip()
        print(f"{idx}: {wFileNumber}")
        closeElem = driver.find_elements(By.XPATH, '//a[@aria-label="Close"]')[-1]
        closeElem.click()
    driver.quit()
When you watch the opened Chrome window, the script opens all 10 rows one by one and then parses the page with bs4 to get the file number of each entry. But in the output I always get the same file number 10 times (the file number from the first row):
$ python temp2.py
Checking Browser driver...
Working for https://residentialprotection.alberta.ca/public-registry/Property
10
0: 21RU3557182
1: 21RU3557182
2: 21RU3557182
3: 21RU3557182
4: 21RU3557182
5: 21RU3557182
6: 21RU3557182
7: 21RU3557182
8: 21RU3557182
9: 21RU3557182
Why do I not get the different file numbers that I can see in the Chrome window opened by Selenium?
Solution
The suggestion from @Andrej Kesely is good and quicker than using Selenium.
Anyway, if you want to know what the issue in your code is: on this site, when you click on a row and its detail table opens, that table stays in the DOM even after you close it. The `find` method returns only the first element matching the label `FileNumber`, so it always returns the label from the first opened table.
To fix your code, always take the last matching element, using `find_all` with index `[-1]` - that is always the most recently opened table:
soup = BeautifulSoup(driver.page_source, 'lxml')
worker = soup.find_all("label", {"for": "FileNumber"})[-1]
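The root cause can be demonstrated without Selenium at all. The snippet below is a minimal sketch with hypothetical HTML and made-up file numbers: it simulates a page where three detail tables have accumulated in the DOM, and shows that `find` always returns the first match while `find_all(...)[-1]` returns the last (most recently opened) one. It uses the built-in `html.parser` so no `lxml` install is needed.

```python
from bs4 import BeautifulSoup

# Simulated page source: each opened detail table is left in the DOM,
# so after opening three rows there are three "FileNumber" labels.
# The file numbers here are made up for illustration.
html = """
<div><label for="FileNumber">File Number</label><table><tr><td>21RU0000001</td></tr></table></div>
<div><label for="FileNumber">File Number</label><table><tr><td>21RU0000002</td></tr></table></div>
<div><label for="FileNumber">File Number</label><table><tr><td>21RU0000003</td></tr></table></div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns the FIRST matching label, so this is always row 1's number
first = soup.find("label", {"for": "FileNumber"}).find_next("td").text.strip()

# find_all()[-1] returns the LAST matching label - the most recently opened table
last = soup.find_all("label", {"for": "FileNumber"})[-1].find_next("td").text.strip()

print(first)  # 21RU0000001
print(last)   # 21RU0000003
```

This is exactly what happens in the original loop: the page source grows by one table per click, but `find` keeps pointing at the first one.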
Answered By - Yaroslavm