Issue
I'm trying to scrape a table from this website: https://www.cbc.ca/sports/basketball/cebl/broadcast. I inspected the page in the browser and confirmed that the table exists in the HTML.
However, when I run this code to find the table element with Beautiful Soup, it prints None.
from bs4 import BeautifulSoup
import requests
url = 'https://www.cbc.ca/sports/basketball/cebl/broadcast'
page = requests.get(url)
soup = BeautifulSoup(page.text,'lxml')
print(soup.find('table'))
Then, when I look up the div that contains the table with:
url = 'https://www.cbc.ca/sports/basketball/cebl/broadcast'
page = requests.get(url)
soup = BeautifulSoup(page.text,'lxml')
soup.find('div',{'class':'schedulecanvas'})
it outputs
<div class="schedulecanvas"></div>
In the page's HTML the table should be inside this tag, but it does not show up in my program's output. Why is Beautiful Soup not finding the table element?
Solution
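The data you see on the page is loaded from an external URL via JavaScript, so it is not part of the HTML that requests receives, which is why the div comes back empty. To load the data into a pandas DataFrame, you can use the following example: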
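import requests
import pandas as pd
# JSON endpoint that the page fetches the schedule from
url = "https://www.cbc.ca/sports-content/v11/includes/json/schedules/broadcast_schedule.json"
# the list of events is stored under the "schedule" key
df = pd.DataFrame(requests.get(url).json()["schedule"])
# to_markdown() needs the optional "tabulate" package
print(df.head(3).to_markdown(index=False))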
Prints:
| stt | end | ti | url | on | typ | oly | id | nb | thumb |
|---|---|---|---|---|---|---|---|---|---|
| 07/14/2022 08:00 EDT | 07/14/2022 10:00 EDT | FIVB Women's Volleyball Nations League Final - Quarter-final - Italy vs China | /1.6515907 | ['web'] | ['volleyball'] | [] | ff81a672-7139-4278-8416-e8261035ac89 | https://i.cbc.ca/1.6519911.1657893904!/httpImage/image.jpeg_gen/derivatives/16x9_300/image.jpeg | |
| 07/14/2022 11:30 EDT | 07/14/2022 13:30 EDT | FIVB Women's Volleyball Nations League Final - Quarter-final - Turkey vs Thailand | /1.6515907 | ['web'] | ['volleyball'] | [] | dd7d06f7-71f8-485e-86c7-5d2e9b074b9c | https://i.cbc.ca/1.6519911.1657893904!/httpImage/image.jpeg_gen/derivatives/16x9_300/image.jpeg | |
| 07/14/2022 19:00 EDT | 07/14/2022 21:00 EDT | Canadian Elite Basketball League: Scarborough Shooting Stars vs Hamilton Honey Badgers | /1.6511882 | ['web'] | ['basketball', 'CEBL'] | [] | 23469577-a854-4e0c-9a8e-369d0bf980a1 | http://i.cbc.ca/1.470240!/fileImage/httpImage/image.jpg_gen/derivatives/16x9_300/default-headline-image-sports.jpg | |
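The JSON feed covers every sport, while the question is about the CEBL page, so the DataFrame probably needs filtering. Below is a minimal sketch of one way to do that. It assumes the typ column always holds a list of tags, as in the output above; filter_by_tag is a hypothetical helper name, and the sample rows are a subset of the columns copied from that output so the snippet runs without a network request.

```python
import pandas as pd

# Two sample rows shaped like the "schedule" entries printed above
# (subset of columns; values copied from that output).
sample_schedule = [
    {
        "stt": "07/14/2022 08:00 EDT",
        "end": "07/14/2022 10:00 EDT",
        "ti": "FIVB Women's Volleyball Nations League Final - Quarter-final - Italy vs China",
        "typ": ["volleyball"],
    },
    {
        "stt": "07/14/2022 19:00 EDT",
        "end": "07/14/2022 21:00 EDT",
        "ti": "Canadian Elite Basketball League: Scarborough Shooting Stars vs Hamilton Honey Badgers",
        "typ": ["basketball", "CEBL"],
    },
]


def filter_by_tag(df, tag):
    """Return the rows whose 'typ' list contains the given tag (hypothetical helper)."""
    # Assumption: every 'typ' value is a list of strings.
    mask = df["typ"].apply(lambda tags: tag in tags)
    return df[mask].reset_index(drop=True)


df = pd.DataFrame(sample_schedule)
cebl = filter_by_tag(df, "CEBL")
print(cebl[["stt", "end", "ti"]])
```

With the live data, the same helper could be applied to the DataFrame built from the JSON URL instead of the sample rows.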
Answered By - Andrej Kesely