Issue
I want to scrape the links to all games of a specific gameweek. I can see them in the browser's inspector, but my code always returns the links of the upcoming games, no matter which page (gameweek) I request, for example https://www.euroleaguebasketball.net/euroleague/game-center/?round=1&season=E2021
soup.find_all('a', class_="game-card-view_linkWrap__u3Tea")
shows:
['/euroleague/game-center/2021-22/olympiacos-piraeus-anadolu-efes-istanbul/E2021/228/',
'/euroleague/game-center/2021-22/alba-berlin-zenit-st-petersburg/E2021/227/',
'/euroleague/game-center/2021-22/as-monaco-zalgiris-kaunas/E2021/226/',
'/euroleague/game-center/2021-22/maccabi-playtika-tel-aviv-cska-moscow/E2021/229/',
'/euroleague/game-center/2021-22/ax-armani-exchange-milan-bitci-baskonia-vitoria-gasteiz/E2021/230/',
'/euroleague/game-center/2021-22/unics-kazan-crvena-zvezda-mts-belgrade/E2021/231/',
'/euroleague/game-center/2021-22/fenerbahce-beko-istanbul-fc-bayern-munich/E2021/232/',
'/euroleague/game-center/2021-22/ldlc-asvel-villeurbanne-panathinaikos-opap-athens/E2021/233/',
'/euroleague/game-center/2021-22/real-madrid-fc-barcelona/E2021/234/']
but the expected result is the links for game 1 through game 9.
Solution
What happens?
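The content of the website is generated dynamically with JavaScript, and requests only downloads the initial HTML. It cannot execute scripts or render the page the way a browser does, so the links you see in the inspector are not in the response.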
How to fix?
Option#1:
Use the API to get the match information directly:
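import requests

# roundNumber in the query string selects the gameweek (round 1 here)
URL = 'https://feeds.incrowdsports.com/provider/euroleague-feeds/v2/competitions/E/seasons/E2021/games?teamCode=&phaseTypeCode=RS&roundNumber=1'
headers = {
    'accept': '*/*',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
}
r = requests.get(URL, headers=headers)
r.raise_for_status()  # stop early on an HTTP error
data = r.json()['data']  # match data for the requested round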
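To collect more than one gameweek, the same request can be wrapped in a small helper. The following is only a sketch: `build_request` and `fetch_round` are hypothetical names, and it assumes that `roundNumber` is the only value that has to change per gameweek and that the other query parameters stay as in the URL above.

```python
import requests

# Endpoint and headers copied from the request above.
BASE_URL = (
    "https://feeds.incrowdsports.com/provider/euroleague-feeds/v2"
    "/competitions/E/seasons/{season}/games"
)
HEADERS = {
    "accept": "*/*",
    "user-agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36"
    ),
}


def build_request(round_number, season="E2021"):
    """Return (url, params) for one regular-season gameweek.

    Assumption: only roundNumber varies between gameweeks.
    """
    url = BASE_URL.format(season=season)
    params = {
        "teamCode": "",
        "phaseTypeCode": "RS",
        "roundNumber": round_number,
    }
    return url, params


def fetch_round(round_number, season="E2021"):
    """Download the 'data' part of the feed for one gameweek."""
    url, params = build_request(round_number, season)
    r = requests.get(url, params=params, headers=HEADERS, timeout=10)
    r.raise_for_status()
    return r.json()["data"]
```

Calling `fetch_round(1)` performs the same request as above. Looping over `range(1, 4)`, for instance, would fetch the first three gameweeks one after another.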
Option#2:
Use Selenium to render the page as a browser would, then scrape the data from driver.page_source.
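A minimal sketch of this second option, assuming Chrome and Selenium 4 are installed. `fetch_rendered_html` and `extract_game_links` are hypothetical helper names. The wait condition and the link filter both rely on the class prefix `game-card-view_linkWrap` from the question, which the site may change at any time. The parsing step uses the standard library's HTML parser to keep the sketch free of extra dependencies; the `soup.find_all(...)` call from the question should work on the returned HTML just as well.

```python
from html.parser import HTMLParser

# URL pattern taken from the question; only `round` and `season` vary.
PAGE_URL = (
    "https://www.euroleaguebasketball.net/euroleague/game-center/"
    "?round={round}&season={season}"
)
# Class prefix from the question's selector (assumed to stay stable).
CARD_CLASS_PREFIX = "game-card-view_linkWrap"


class GameLinkParser(HTMLParser):
    """Collect the href of every <a> whose class contains the card prefix."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        css_class = attributes.get("class") or ""
        href = attributes.get("href")
        if tag == "a" and href and CARD_CLASS_PREFIX in css_class:
            self.links.append(href)


def extract_game_links(html):
    """Return the game links found in an HTML string, in document order."""
    parser = GameLinkParser()
    parser.feed(html)
    return parser.links


def fetch_rendered_html(round_number, season="E2021", timeout=15):
    """Open the page in headless Chrome and return the rendered HTML."""
    # Imported here so the parsing helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(PAGE_URL.format(round=round_number, season=season))
        # Wait until at least one game card link is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located(
                (By.CSS_SELECTOR, f"a[class*='{CARD_CLASS_PREFIX}']")
            )
        )
        return driver.page_source
    finally:
        driver.quit()
```

With these helpers, `extract_game_links(fetch_rendered_html(1))` would return the links for gameweek 1, provided the page has finished switching to the requested round by the time the first card appears. If it has not, a stricter wait condition may be needed.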
Answered By - HedgeHog