Issue
I am trying to parse a specific webpage and extract the results of a keyword search. I want to extract the results into a list (like ["Q9NXG6", "Q9H6G9"], etc.).
I am unable to find the exact locator/syntax to do this.
For any find_elements call (by class name, XPath, etc.), I either get
InvalidSelectorException: invalid selector: An invalid or illegal selector was specified
(or)
selenium.webdriver.remote.webelement.WebElement
This is the code so far:
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
page = "https://www.uniprot.org/"
web = webdriver.Chrome()
web.get(page)
search_protein = "prolyl 4 hydroxylase"
search_tab = web.find_element(By.CSS_SELECTOR, "#root > div.N8ovH > div > main > div > div.hero-header__content > div > section > form > div.main-search__input-container > input[type=search]")
search_tab.send_keys(search_protein)
search_tab.send_keys(Keys.RETURN)
#print (web.current_url)
print (web.find_elements(?))
Solution
Since you referred to just one page, I removed the search method and instead included it in the url as follows:
https://www.uniprot.org/uniprotkb?query=prolyl+4+hydroxylase
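The query string in that URL can also be generated from any search term, which avoids typing into the search box at all. A minimal sketch using only the standard library; `build_search_url` and `BASE_URL` are hypothetical names introduced here, and the URL pattern is assumed to stay the same as in the example above:

```python
from urllib.parse import urlencode

# Assumption: results live under this path, as in the URL above.
BASE_URL = "https://www.uniprot.org/uniprotkb"


def build_search_url(term):
    """Return the results-page URL for a free-text search term."""
    # urlencode escapes the term; spaces become '+' as in the example URL.
    return f"{BASE_URL}?{urlencode({'query': term})}"


print(build_search_url("prolyl 4 hydroxylase"))
# https://www.uniprot.org/uniprotkb?query=prolyl+4+hydroxylase
```

The returned string can be passed straight to web.get().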
If you need the program to use other search terms, remove the query from the URL and reimplement the search step.
As for getting the list, each result element has class=BqBnJ, so you can use the following method:
import selenium
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
page = "https://www.uniprot.org/uniprotkb?query=prolyl+4+hydroxylase"
web = webdriver.Chrome()
web.get(page)
# let page load
time.sleep(5)
# gets elements with class 'BqBnJ'
elements = web.find_elements(By.CLASS_NAME, "BqBnJ")
for element in elements:
    print(element.text)
web.close()
This prints the text of each matching element, one per line.
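The question asked for a Python list rather than printed output, and printing the elements themselves only shows WebElement objects. A minimal sketch of collecting the text instead; `collect_texts` is a hypothetical helper, not part of Selenium, and it only assumes each element exposes a `.text` attribute, which is why stand-in objects are used in the demo:

```python
from types import SimpleNamespace


def collect_texts(elements):
    """Return the stripped, non-empty .text of each element as a list."""
    texts = []
    for element in elements:
        text = element.text.strip()
        if text:  # skip elements that render no text
            texts.append(text)
    return texts


# Stand-ins for the WebElement objects returned by find_elements.
fake_elements = [
    SimpleNamespace(text=" Q9NXG6 "),
    SimpleNamespace(text=""),
    SimpleNamespace(text="Q9H6G9"),
]
print(collect_texts(fake_elements))
# ['Q9NXG6', 'Q9H6G9']
```

With a real driver, the call would be collect_texts(web.find_elements(By.CLASS_NAME, "BqBnJ")), made before web.close() so the elements are still attached to a live page.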
Answered By - thetaco