Issue
I'm a newbie seeking help. I tried the following, without success:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = "https://www.canada.ca/en/immigration-refugees-citizenship/corporate/mandate/policies-operational-instructions-agreements/ministerial-instructions/express-entry-rounds.html"
html_text = requests.get(url).text
soup = BeautifulSoup(html_text, 'html.parser')
data = []
# Verifying tables and their classes
print('Classes of each table:')
for table in soup.find_all('table'):
    print(table.get('class'))
Result:
['table']
None
Can anyone help me with how to get this data? Thank you so much.
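A quick way to check whether a table's rows are actually present in the HTML that requests received is to count the `<tr>` elements inside each table. The sketch below uses only the standard library; the `TableRowCounter` class and the `sample_html` string are hypothetical and stand in for the real page, on the assumption that the real page ships a header row and an empty body.

```python
from html.parser import HTMLParser


class TableRowCounter(HTMLParser):
    """Count <tr> elements inside each top-level <table> of static HTML."""

    def __init__(self):
        super().__init__()
        self.row_counts = []  # one entry per top-level <table>, in document order
        self._depth = 0       # greater than 0 while the parser is inside a <table>

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self.row_counts.append(0)
        elif tag == "tr" and self._depth > 0:
            self.row_counts[-1] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._depth > 0:
            self._depth -= 1


# Hypothetical sample: a header row only, with an empty <tbody>.
sample_html = """
<table class="table">
  <thead><tr><th>Round</th><th>Date</th></tr></thead>
  <tbody></tbody>
</table>
"""

counter = TableRowCounter()
counter.feed(sample_html)
print(counter.row_counts)
```

If a table reports only its header row, the data rows are not in the static HTML, and no amount of parsing will recover them from it.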
Solution
The data you see on the page is loaded by JavaScript from an external URL, so it is not in the HTML that requests returns. To load the data, you can use the following example:
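import requests
import pandas as pd
# JSON file that the page fetches to fill its table
url = "https://www.canada.ca/content/dam/ircc/documents/json/ee_rounds_123_en.json"
data = requests.get(url).json()
# Each entry under "rounds" becomes one row
df = pd.DataFrame(data["rounds"])
# Drop columns that are not needed for the output
df = df.drop(columns=["drawNumberURL", "DrawText1", "mitext"])
# to_markdown() requires the tabulate package
print(df.head(10).to_markdown(index=False))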
Prints:
drawNumber | drawDate | drawDateFull | drawName | drawSize | drawCRS | drawText2 | drawDateTime | drawCutOff | drawDistributionAsOn | dd1 | dd2 | dd3 | dd4 | dd5 | dd6 | dd7 | dd8 | dd9 | dd10 | dd11 | dd12 | dd13 | dd14 | dd15 | dd16 | dd17 | dd18 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
231 | 2022-09-14 | September 14, 2022 | No Program Specified | 3,250 | 510 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | September 14, 2022 at 13:29:26 UTC | January 08, 2022 at 10:24:52 UTC | September 12, 2022 | 408 | 6,228 | 63,860 | 5,845 | 9,505 | 19,156 | 16,541 | 12,813 | 58,019 | 12,245 | 12,635 | 9,767 | 11,186 | 12,186 | 68,857 | 35,833 | 5,068 | 238,273 |
230 | 2022-08-31 | August 31, 2022 | No Program Specified | 2,750 | 516 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 31, 2022 at 13:55:23 UTC | April 16, 2022 at 18:24:41 UTC | August 29, 2022 | 466 | 7,224 | 63,270 | 5,554 | 9,242 | 19,033 | 16,476 | 12,965 | 58,141 | 12,287 | 12,758 | 9,796 | 11,105 | 12,195 | 68,974 | 36,001 | 5,120 | 239,196 |
229 | 2022-08-17 | August 17, 2022 | No Program Specified | 2,250 | 525 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 17, 2022 at 13:43:47 UTC | December 28, 2021 at 11:03:15 UTC | August 15, 2022 | 538 | 8,221 | 62,753 | 5,435 | 9,129 | 18,831 | 16,465 | 12,893 | 58,113 | 12,200 | 12,721 | 9,801 | 11,138 | 12,253 | 68,440 | 35,745 | 5,137 | 238,947 |
228 | 2022-08-03 | August 3, 2022 | No Program Specified | 2,000 | 533 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | August 03, 2022 at 15:16:24 UTC | January 06, 2022 at 14:29:50 UTC | August 2, 2022 | 640 | 8,975 | 62,330 | 5,343 | 9,044 | 18,747 | 16,413 | 12,783 | 57,987 | 12,101 | 12,705 | 9,747 | 11,117 | 12,317 | 68,325 | 35,522 | 5,145 | 238,924 |
227 | 2022-07-20 | July 20, 2022 | No Program Specified | 1,750 | 542 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 20, 2022 at 16:32:49 UTC | December 30, 2021 at 15:29:35 UTC | July 18, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
226 | 2022-07-06 | July 6, 2022 | No Program Specified | 1,500 | 557 | Federal Skilled Worker, Canadian Experience Class, Federal Skilled Trades and Provincial Nominee Program | July 6, 2022 at 14:34:34 UTC | November 13, 2021 at 02:20:46 UTC | July 11, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
225 | 2022-06-22 | June 22, 2022 | Provincial Nominee Program | 636 | 752 | Provincial Nominee Program | June 22, 2022 at 14:13:57 UTC | April 19, 2022 at 13:45:45 UTC | June 20, 2022 | 664 | 8,017 | 55,917 | 4,246 | 7,845 | 16,969 | 15,123 | 11,734 | 53,094 | 10,951 | 11,621 | 8,800 | 10,325 | 11,397 | 64,478 | 33,585 | 4,919 | 220,674 |
224 | 2022-06-08 | June 8, 2022 | Provincial Nominee Program | 932 | 796 | Provincial Nominee Program | June 08, 2022 at 14:03:28 UTC | October 18, 2021 at 17:13:17 UTC | June 6, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
223 | 2022-05-25 | May 25, 2022 | Provincial Nominee Program | 590 | 741 | Provincial Nominee Program | May 25, 2022 at 13:21:23 UTC | February 02, 2022 at 12:29:53 UTC | May 23, 2022 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
222 | 2022-05-11 | May 11, 2022 | Provincial Nominee Program | 545 | 753 | Provincial Nominee Program | May 11, 2022 at 14:08:07 UTC | December 15, 2021 at 20:32:57 UTC | May 9, 2022 | 635 | 7,193 | 52,684 | 3,749 | 7,237 | 16,027 | 14,466 | 11,205 | 50,811 | 10,484 | 11,030 | 8,393 | 9,945 | 10,959 | 62,341 | 32,590 | 4,839 | 211,093 |
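Values such as `3,250` in the output above contain thousands separators, which suggests the columns are strings rather than numbers. The following is a minimal sketch of one way to convert them, assuming every value in these columns arrives as a string; the `rounds` list is a small hand-copied sample of the output above, not the full JSON response.

```python
import pandas as pd

# Two records copied from the output above, trimmed to a few columns.
rounds = [
    {"drawNumber": "231", "drawDate": "2022-09-14", "drawSize": "3,250", "drawCRS": "510"},
    {"drawNumber": "230", "drawDate": "2022-08-31", "drawSize": "2,750", "drawCRS": "516"},
]
df = pd.DataFrame(rounds)

# Strip thousands separators, then convert the text to integers.
# Assumption: these columns hold strings; .str would fail on numeric columns.
for col in ["drawNumber", "drawSize", "drawCRS"]:
    df[col] = df[col].str.replace(",", "", regex=False).astype(int)

# Parse the ISO-formatted date column into real datetimes.
df["drawDate"] = pd.to_datetime(df["drawDate"])

print(df.dtypes)
print(df["drawSize"].sum())
```

After the conversion, the columns can be sorted, summed, or plotted as numbers and dates instead of text.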
Answered By - Andrej Kesely