Issue
I am trying to scrape data from this website:
https://www.shanghairanking.com/rankings/grsssd/2021
Initially pandas got me out of the gate and I could scrape the table, but I am struggling with the drop-down menus. I want to select the options next to the Total Score box, which are PUB, CIT, etc. When I inspect the element, it looks like it may be JavaScript, and the usual methods of iterating over these options don't work. I have tried BeautifulSoup and, most recently, Selenium to select the drop-downs by hand. Scraping works for the default table data, but this Selenium code:
'''
import time
import pandas as pd
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Chrome('/Users/martinbell/Downloads/chromedriver')
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
submit = driver.find_element_by_xpath("//input[@value='CIT']").click()
'''
Doesn't get me anywhere.
Solution
Your code does not work because you first have to click the dropdown open and then traverse the options inside it. Note that the refactored code below uses time.sleep for demonstration purposes only; for robust code and good practice, use an explicit wait such as WebDriverWait.
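import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes chromedriver is discoverable (on PATH or via Selenium Manager)
driver.get('https://www.shanghairanking.com/rankings/grsssd/2021')
time.sleep(10)  # crude fixed wait for the page to render
# Click the third dropdown open so that its options can be clicked.
driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# The commented-out code below loops through all the dropdown options and performs actions.
# opt_ele = driver.find_elements(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li")
# for ele in opt_ele:
#     print(ele.text)
#     ele.click()
#     print('perform your actions here')
#     # Reopen the dropdown before moving on to the next option.
#     driver.find_element(By.XPATH, "(//*[@class='inputWrapper'])[3]").click()
# If you do not want to loop through but just want to select only CIT, here is the line:
driver.find_element(By.XPATH, "(//*[@class='rank-select'])[2]//*[@class='options']//li[text()='CIT']").click()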
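The explicit wait mentioned in the note could look roughly like the sketch below. It assumes the same XPaths as above still match the page. The helper names `option_xpath` and `select_option` and the 20-second default timeout are my own illustrative choices, not part of Selenium.

```python
# XPaths taken from the answer above; they may break if the site's markup changes.
DROPDOWN_XPATH = "(//*[@class='inputWrapper'])[3]"
OPTIONS_XPATH = "(//*[@class='rank-select'])[2]//*[@class='options']//li"


def option_xpath(label: str) -> str:
    """Build the XPath for a single dropdown option such as 'CIT' or 'PUB'."""
    return f"{OPTIONS_XPATH}[text()='{label}']"


def select_option(driver, label: str, timeout: float = 20) -> None:
    """Open the dropdown and click `label`, waiting until each element is clickable.

    Hypothetical helper: `driver` is an already created Selenium WebDriver
    that has loaded the rankings page.
    """
    # Imported here so that option_xpath() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Wait for the dropdown itself instead of sleeping for a fixed time.
    wait.until(EC.element_to_be_clickable((By.XPATH, DROPDOWN_XPATH))).click()
    # Then wait for the requested option to become clickable and click it.
    wait.until(EC.element_to_be_clickable((By.XPATH, option_xpath(label)))).click()
```

With a driver that has already called `driver.get(...)` on the rankings page, `select_option(driver, 'CIT')` should be able to stand in for the `time.sleep(10)` call and the two `find_element(...).click()` lines. If an element never becomes clickable, `wait.until` raises a `TimeoutException` after `timeout` seconds rather than failing silently.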
Answered By - Anand Gautam