Issue
I am trying to extract timings from this page. I would like these timings in text form to do more processing.
I have tried this code:
times = []
elements = driver.find_elements(by=By.CLASS_NAME, value="lyr_timeCount")
for elem in elements:
    times.append(elem.text)
using the Selenium driver. However, elements is an empty list. Locating the elements by XPath gives the same result, and so does Beautiful Soup:
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
times = soup.find_all('time', {'class': 'lyr_sqResTime'})
Both have resulted in empty lists. How can I extract the timing data from this webpage using either method?
Solution
The timing elements are within an <iframe>, so you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the visibility of all the desired elements.
You can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.get('https://www.capmetro.org/planner/?language=en_US&P=SQ&input=Museum%20Station%20(SB),%20Stop%20ID%205866&start=yes&widget=1.0.0&')
# Wait for the frame to be available, then switch into it
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.CSS_SELECTOR, "iframe[title='Trip Planner']")))
# Wait until all timing elements are visible, then collect their text
print([my_elem.text for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "span.lyr_timeCount")))])
Using XPATH:
driver.get('https://www.capmetro.org/planner/?language=en_US&P=SQ&input=Museum%20Station%20(SB),%20Stop%20ID%205866&start=yes&widget=1.0.0&')
# Wait for the frame to be available, then switch into it
WebDriverWait(driver, 20).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH, "//iframe[@title='Trip Planner']")))
# Wait until all timing elements are visible, then collect their text
print([my_elem.text for my_elem in WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='lyr_timeCount ']")))])
Console Output:
['4 min\nMinutes\nconfirmed', '7 min\nMinutes', '16 min\nMinutes\nconfirmed', '20 min\nMinutes\nconfirmed', '21 min\nMinutes\nconfirmed', '23 min\nMinutes\nconfirmed', '37 min\nMinutes', '37 min\nMinutes', '39 min\nMinutes\nconfirmed', '42 min', '4:43 PM', '4:53 PM', '4:53 PM', '5:03 PM', '5:03 PM', '5:13 PM', '5:13 PM', '5:23 PM', '5:23 PM', '5:33 PM', '5:33 PM', '5:43 PM', '5:43 PM', '5:53 PM', '5:53 PM', '6:03 PM', '6:08 PM', '6:13 PM', '6:23 PM', '6:23 PM', '6:33 PM', '6:38 PM', '6:38 PM', '6:43 PM', '6:50 PM', '6:53 PM', '6:58 PM', '7:05 PM', '7:08 PM', '7:13 PM']
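Since the goal is further processing, the raw strings above probably need some cleanup first. The following is only a sketch, assuming every entry is either a countdown such as '4 min' (optionally followed by extra lines) or a 12-hour clock time such as '4:43 PM', as in the console output shown. The function name parse_entry is hypothetical and not part of Selenium.

```python
from datetime import datetime


def parse_entry(text):
    """Classify one scraped string as a countdown or a clock time.

    Assumes the first line is either '<n> min' or a 12-hour time such
    as '4:43 PM'; any further lines ('Minutes', 'confirmed') are dropped.
    """
    first_line = text.split("\n", 1)[0].strip()
    if first_line.endswith(" min"):
        # Strip the trailing ' min' and keep the number of minutes
        return ("minutes", int(first_line[: -len(" min")]))
    # Otherwise treat the entry as a clock time
    return ("clock", datetime.strptime(first_line, "%I:%M %p").time())


# A few entries copied from the console output above
sample = ['4 min\nMinutes\nconfirmed', '7 min\nMinutes', '42 min', '4:43 PM']
parsed = [parse_entry(entry) for entry in sample]
print(parsed)
```

If the page ever shows a different label (for example a "due" marker), strptime raises ValueError, so real use would want a fallback branch.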
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
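When a locator unexpectedly returns an empty list, one quick check is whether the page source contains any <iframe> tags at all. Below is a rough sketch using only the standard library. The names IframeFinder and find_iframes are hypothetical, and the sample HTML (including its src value) is made up for illustration; only the 'Trip Planner' title comes from the locators above.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collect the attributes of every <iframe> start tag in a document."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase
        if tag == "iframe":
            self.iframes.append(dict(attrs))


def find_iframes(html):
    """Return one attribute dict per <iframe> found in the HTML string."""
    finder = IframeFinder()
    finder.feed(html)
    return finder.iframes


# Hypothetical markup standing in for the real page source
sample_html = (
    '<html><body>'
    '<iframe title="Trip Planner" src="/hypothetical/planner.html"></iframe>'
    '</body></html>'
)
print(find_iframes(sample_html))
```

With Selenium, the string to inspect could be driver.page_source. A non-empty result suggests switching into the frame before locating elements.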
Answered By - undetected Selenium