Issue
I'm trying to use this code to scroll down to the end of a page:
from selenium import webdriver
url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')
for index in range(len(tickers)):
    print("Row " + tickers[index].text + " ")
But the while loop never ends; Selenium continues to try and scroll downward even after it hits the bottom of the page, so the program doesn't proceed. How can I detect that the bottom of the page has been reached so that the code can continue?
Solution
Under the ticker, the page shows how many rows (matches) are in the table. So, one option is to compare the number of visible rows to that total and quit the loop once the visible rows reach it.
from selenium import webdriver
url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)
# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")
# The total number of matches is the first word of this element's text
selector = '.js-field-total.tv-screener-table__field-value--total'
matches = driver.find_element_by_css_selector(selector)
matches = int(matches.text.split()[0])
visible_rows = 0
scrolls = 0
while visible_rows < matches:
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    # Wait 10 scrolls before updating row information
    if scrolls == 10:
        table = driver.find_elements_by_class_name('tv-data-table__tbody')
        visible_rows = len(table[1].find_elements_by_tag_name('tr'))
        scrolls = 0
    scrolls += 1
# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')
for index in range(len(tickers)):
    print("Row " + tickers[index].text + " ")
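If the row count shown on the page never matches the number of rows actually rendered, the loop above would run forever. As a guard, here is a minimal sketch that also stops once several consecutive scrolls load nothing new. The name scroll_until_loaded and its parameters are hypothetical (not part of Selenium); scroll and count_rows are callables you would supply yourself.

```python
import time


def scroll_until_loaded(scroll, count_rows, expected, pause=2.0, max_stalls=5):
    """Scroll until `expected` rows are visible or the count stops growing.

    `scroll` and `count_rows` are caller-supplied callables (assumptions for
    this sketch); with Selenium they would wrap execute_script and a row
    lookup. Returns the last observed row count.
    """
    seen = count_rows()
    stalls = 0
    while seen < expected and stalls < max_stalls:
        scroll()
        time.sleep(pause)  # give the page time to load the next batch
        current = count_rows()
        if current > seen:
            seen = current
            stalls = 0  # progress was made, so reset the guard
        else:
            stalls += 1  # this scroll loaded no new rows
    return seen
```

With the driver from the code above, scroll could be lambda: driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") and count_rows could be lambda: len(driver.find_elements_by_css_selector('a.tv-screener__symbol')). Comparing the returned count with matches tells you whether everything loaded or the loop gave up early.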
Edit: Since your setup doesn't seem to allow the previous solution, here's a different approach you can try. The page loads 150 rows at a time. So, instead of counting the number of visible rows, we can use the total matches/rows we're expecting (e.g. 4894) and divide that by 150 to get the number of times we need to scroll. If we scroll at least that many times, in theory, all of the rows should be visible and we can continue with the code.
from time import sleep
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
url = 'http://www.tradingview.com/screener'
driver = webdriver.Chrome('./chromedriver')
driver.get(url)
try:
    # Wait up to 10 seconds for the total-matches element to appear
    selector = '.js-field-total.tv-screener-table__field-value--total'
    condition = EC.visibility_of_element_located((By.CSS_SELECTOR, selector))
    matches = WebDriverWait(driver, 10).until(condition)
    matches = int(matches.text.split()[0])
except (TimeoutException, Exception):
    print('Problem finding matches, setting default...')
    matches = 4895  # Set default
# The page loads 150 rows at a time; divide matches by
# 150 to determine the number of times we need to scroll;
# add 5 extra scrolls just to be sure
num_loops = int(matches / 150 + 5)
for _ in range(num_loops):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    sleep(2)  # Pause briefly to allow loading time
# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')
n_tickers = len(tickers)
msg = 'Correct ' if n_tickers == matches else 'Incorrect '
msg += 'number of tickers ({}) found'
print(msg.format(n_tickers))
for index in range(n_tickers):
    print("Row " + tickers[index].text + " ")
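The scroll count used above can also be worked out with ceiling division, so that a partial last batch still gets its own scroll. A small sketch: the helper name scrolls_needed and its parameters are made up for illustration, and the 150-row batch size is simply the figure quoted in this answer, not something verified here.

```python
import math


def scrolls_needed(total_rows, rows_per_load=150, extra=5):
    """Scrolls required to reveal total_rows, plus a safety margin."""
    # Round up so a final, partially filled batch is still counted
    return math.ceil(total_rows / rows_per_load) + extra


print(scrolls_needed(4894))  # 33 batches + 5 extra scrolls = 38
```

Rounding up gives one more scroll than int(matches / 150 + 5) whenever the total is not a multiple of 150; with the 5 extra scrolls either version should leave some margin.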
Answered By - T. Ray