Issue
Basic tables are fairly easy to scrape with Selenium, but I am having trouble with tables whose elements carry "_ngcontent" attributes, such as the ones at https://material.angular.io/components/table/overview. I want to scrape the table into a pandas DataFrame.
This is how far I got:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pandas as pd
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
PATH = r"C:\chromedriver.exe"  # raw string, so the backslash is not read as an escape
driver = webdriver.Chrome(PATH)
URL = 'https://material.angular.io/components/table/overview'
driver.get(URL)
titles = driver.find_element(By.CSS_SELECTOR, '#table-basic > div > div.docs-example-viewer-body.ng-star-inserted > table-basic-example > table > thead')
print(titles.text)
This only returns a single element whose text is 'No. Name Weight Symbol'. I cannot iterate over it to scrape the table data.
Please assist
Solution
To grab the table data easily, you can use Selenium with pandas: wait for the table to be rendered, take its outerHTML, and pass that to pandas.read_html:
from io import StringIO

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from webdriver_manager.chrome import ChromeDriverManager

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
# driver.get() returns None, so there is nothing useful to assign here
driver.get('https://material.angular.io/components/table/overview')
driver.maximize_window()
# Wait until the first table on the page is visible, then take its full HTML
table = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '(//table)[1]'))).get_attribute("outerHTML")
# Wrap the string in StringIO: recent pandas versions deprecate passing literal HTML
df = pd.read_html(StringIO(table))[0]
print(df)
driver.quit()
Output:
   No.       Name   Weight  Symbol
0    1   Hydrogen   1.0079       H
1    2     Helium   4.0026      He
2    3    Lithium   6.9410      Li
3    4  Beryllium   9.0122      Be
4    5      Boron  10.8110       B
5    6     Carbon  12.0107       C
6    7   Nitrogen  14.0067       N
7    8     Oxygen  15.9994       O
8    9   Fluorine  18.9984       F
9   10       Neon  20.1797      Ne
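If you would rather iterate over the rows yourself than build a DataFrame, the same outerHTML string can be walked with Python's standard-library HTML parser. The following is only a minimal sketch, not the method used above: the TableRows class, the parse_table helper and the SAMPLE_HTML string (including its "_ngcontent-abc-c123" attribute names) are made up for illustration, and in a real script the string would come from get_attribute("outerHTML"). It shows that the "_ngcontent" attributes are ordinary attributes that can simply be ignored.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being built, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        # attrs (including any _ngcontent-* attributes) are deliberately ignored
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(outer_html):
    """Return (header, body_rows) for a single HTML table string."""
    parser = TableRows()
    parser.feed(outer_html)
    parser.close()
    header, *body = parser.rows
    return header, body


# Hypothetical stand-in for the outerHTML string returned by Selenium
SAMPLE_HTML = """
<table _ngcontent-abc-c123 class="mat-table">
  <thead>
    <tr _ngcontent-abc-c123><th _ngcontent-abc-c123> No. </th><th> Name </th><th> Weight </th><th> Symbol </th></tr>
  </thead>
  <tbody>
    <tr _ngcontent-abc-c123><td _ngcontent-abc-c123> 1 </td><td> Hydrogen </td><td> 1.0079 </td><td> H </td></tr>
    <tr><td> 2 </td><td> Helium </td><td> 4.0026 </td><td> He </td></tr>
  </tbody>
</table>
"""

header, body = parse_table(SAMPLE_HTML)
for row in body:
    # Pair each cell with its column title
    print(dict(zip(header, row)))
```

All cell values come back as strings here, so numeric columns such as Weight would still need converting; pandas.read_html does that conversion for you, which is one reason to prefer it when a DataFrame is the goal.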
Answered By - F.Hoque