Issue
I'm trying to obtain the 2nd table on the following website. I've tried BS4, Pandas, and now Selenium, but I can't obtain the table for the life of me.
The table data doesn't load until after the page comes up.
There is a dictionary on the 'view source' page with the info, but it seems like every element on that page is 'line-content', which makes it difficult to obtain only the desired table info.
What is the best way to collect the table data?
from ast import Return
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
import time
path = 'C:\Program Files (x86)\chromedriver.exe'
driver = webdriver.Chrome(path)
driver.get('https://www.linestarapp.com/Ownership/Sport/NFL/Site/DraftKings/PID/295')
time.sleep(5)
driver.maximize_window()
time.sleep(5)
players = driver.find_elements_by_class_name('playerRow').text
print(players)
Solution
There are several issues here:
- find_elements_by_class_name returns a list of web elements. You cannot call .text on a list of web elements; you need to iterate over the list and extract the text from each element one by one.
- You should not use hardcoded pauses like time.sleep(5). Explicit waits with Expected Conditions should be used instead.
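The first issue can be reproduced without a browser. The sketch below is only an illustration: FakeElement is a hypothetical stand-in for Selenium's WebElement (only its .text attribute is imitated), and the row texts are placeholder values. It shows that the list itself has no .text, while each element inside it does.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement; only .text is imitated."""

    def __init__(self, text):
        self.text = text


def texts_of(elements):
    """Return the .text of every element in a list of elements."""
    return [element.text for element in elements]


# find_elements_* returns a list like this one, not a single element.
rows = [FakeElement("Player A"), FakeElement("Player B")]

# Asking the list itself for .text fails, because lists have no such attribute.
try:
    rows.text
    list_has_text = True
except AttributeError:
    list_has_text = False

print(list_has_text)
print(texts_of(rows))
```

With a real driver, the same loop or list comprehension would run over the list returned by the find_elements call.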
The corrected script could look something like this:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
# Raw string so the backslashes in the Windows path are kept literally
path = r'C:\Program Files (x86)\chromedriver.exe'
# Selenium 3 style; Selenium 4 expects webdriver.Chrome(service=Service(path)),
# with Service imported from selenium.webdriver.chrome.service
driver = webdriver.Chrome(path)
wait = WebDriverWait(driver, 20)
driver.get('https://www.linestarapp.com/Ownership/Sport/NFL/Site/DraftKings/PID/295')
driver.maximize_window()
# Wait (up to 20 seconds) until the first player row is visible
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".playerRow")))
# Short pause to make sure all the rows are loaded once the first one has appeared
time.sleep(0.5)
# find_elements returns a list; the find_elements_by_* helpers were removed in Selenium 4.3
players = driver.find_elements(By.CLASS_NAME, 'playerRow')
for player in players:
    print(player.text)
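To illustrate why an explicit wait is preferable to a fixed time.sleep(5), here is a simplified sketch of the polling idea behind wait.until. It is not Selenium's implementation: wait_until and rows_loaded are hypothetical names, the "page" is simulated, and the sketch raises Python's built-in TimeoutError where Selenium raises its own TimeoutException. The point is that the wait returns as soon as the condition holds instead of always sleeping for the full duration.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Simulated page: the rows only "appear" on the third check.
checks = {"count": 0}


def rows_loaded():
    checks["count"] += 1
    return ["row 1", "row 2"] if checks["count"] >= 3 else []


rows = wait_until(rows_loaded, timeout=5.0, poll=0.01)
print(rows, checks["count"])
```

Here the call returns after three quick checks rather than after a fixed pause, and a condition that never becomes true fails loudly with an exception instead of silently continuing with an empty result.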
UPD
Using the CSS selector #tableTournament .playerRow instead of just .playerRow will give you the right table rows.
Answered By - Prophet