Issue
I'm trying to use Google Colab to web-scrape a table from this website, but when I run the code below I get an empty list.
import urllib.request as url
from bs4 import BeautifulSoup
page = 'https://www.stadiumgaming.gg/rank-checker?pokemon=Walrein'
html = url.urlopen(page)
soup = BeautifulSoup(html, 'html5lib').findAll('td')
print(soup)
Output:
[]
How can I find the table on this page so that it can be parsed into a dataframe?
Solution
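BeautifulSoup alone can't find the table on this page because the table is populated dynamically by JavaScript, and bs4 only parses the HTML it is given; it does not execute scripts. You can, however, load the page with Selenium so that a real browser runs the JavaScript, then pass the rendered HTML to bs4 and pandas.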
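To illustrate why the original attempt returns nothing, here is a minimal sketch using only the standard library's html.parser. The two HTML snippets are made up for illustration (they are not the actual markup of the site): one mimics a page as served, with an empty table and a script that would fill it, and the other mimics the same table after a browser has run the script.

```python
from html.parser import HTMLParser


class CellCounter(HTMLParser):
    """Counts <td> start tags in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.cells = 0

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.cells += 1


def count_cells(html):
    """Return the number of <td> elements a static parser can see."""
    parser = CellCounter()
    parser.feed(html)
    parser.close()
    return parser.cells


# Hypothetical markup as served: the table body is empty,
# and the script that fills it only runs inside a browser.
served_html = """
<table id="ranks"><tbody></tbody></table>
<script>
  const row = document.querySelector("#ranks tbody").insertRow();
  row.insertCell().textContent = "1";
  row.insertCell().textContent = "0/12/15";
</script>
"""

# Hypothetical markup after a browser has executed the script.
rendered_html = (
    '<table id="ranks"><tbody>'
    "<tr><td>1</td><td>0/12/15</td></tr>"
    "</tbody></table>"
)

print(count_cells(served_html))    # 0
print(count_cells(rendered_html))  # 2
```

A static parser never runs the script, so it sees no cells in the served version. Selenium closes that gap by letting a real browser run the script before the HTML is parsed.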
from io import StringIO
import time

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

webdriver_service = Service("./chromedriver")  # your chromedriver path
driver = webdriver.Chrome(service=webdriver_service)
url = 'https://www.stadiumgaming.gg/rank-checker?pokemon=WALREIN'
driver.get(url)
driver.maximize_window()
time.sleep(3)  # give the JavaScript time to populate the table
table = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()  # close the browser once the HTML has been captured
# Newer pandas versions warn when given a literal HTML string, so wrap it in StringIO
df = pd.read_html(StringIO(str(table)))[1]
print(df.iloc[1:, 0:9])
Result:
    Rank       IVs    CP   Lvl       %     Atk     Def  Sta    Prod
 1  2072  10/10/10  1483    20   94.97  114.70  111.12  150  1911.8
 2     1   0/12/15  1499    21  100.00  111.41  115.09  157  2013.1
 3     2   0/13/14  1500    21   99.89  111.41  115.70  156  2010.9
 4     3   0/13/13  1497    21   99.89  111.41  115.70  156  2010.9
 5     4   0/14/12  1498    21   99.78  111.41  116.31  155  2008.6
 6     5   1/14/10  1500    21   99.68  112.02  116.31  154  2006.6
 7     6   0/15/11  1499    21   99.65  111.41  116.92  154  2006.1
 8     7   0/15/10  1496    21   99.65  111.41  116.92  154  2006.1
 9     8    1/15/8  1498    21   99.55  112.02  116.92  153    2004
10     9   3/15/15  1499  20.5   99.53  111.89  115.52  155  2003.5
11    10   1/10/15  1499    21   99.48  112.02  113.86  157  2002.6
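A fixed time.sleep(3) can be too short on a slow connection and wasteful on a fast one. As a possible variation (not part of the original answer), the sketch below waits explicitly until a table cell appears and runs Chrome headless, which is usually needed in Google Colab. The function name fetch_rendered_html and the "table td" selector are assumptions made for illustration, and the sketch assumes Selenium 4.6 or later plus an installed Chrome, so that the driver can be located automatically.

```python
def fetch_rendered_html(url, css_selector="table td", timeout=15):
    """Return the page HTML once an element matching css_selector is present.

    Hypothetical helper: the default selector is a guess and may need
    adjusting to the page's real markup.
    """
    # Imported inside the function so it can be defined before selenium
    # is installed (for example, before a `pip install selenium` cell runs).
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless=new")           # no display available
    options.add_argument("--no-sandbox")             # commonly needed in containers
    options.add_argument("--disable-dev-shm-usage")  # avoid small /dev/shm issues

    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Poll until at least one matching element exists; raises
        # TimeoutException if nothing appears within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        driver.quit()  # always release the browser


# Possible usage (requires a working Chrome install):
# from io import StringIO
# import pandas as pd
# html = fetch_rendered_html(
#     "https://www.stadiumgaming.gg/rank-checker?pokemon=WALREIN")
# tables = pd.read_html(StringIO(html))
```

The returned HTML can be handed to pd.read_html in the same way as driver.page_source; which index in the resulting list holds the rank table still has to be checked by inspection.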
Answered By - F.Hoque