Issue
I am having problem when trying to scrape https://www.bet365.com/ using urllib.request
and BeautifulSoup
.
The problem is, the code below doesn't get all the information on the page, for example players' names don't appear. Maybe another framework or configuration to extract the information?
My code is:
from bs4 import BeautifulSoup
import urllib.request
url = "https://www.bet365.com/"
try:
page = urllib.request.urlopen(url)
except:
print("An error occured.")
soup = BeautifulSoup(page, 'html.parser')
soup = str(soup)
Solution
Looking at the source code for the page in question it looks like essentially all of the data is populated by Javascript. BeautifulSoup isn't a headless client, it's just something that downloads and parses HTML, so anything that's populated with Javascript it can't see. You'd need a headless browser like selenium to scrape something like that.
Answered By - Daniel McLaury
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.