Issue
I want to download an Excel file from NHL.com. If you visit this link, you can see an Excel export that can be downloaded. But when I try to scrape the page with BeautifulSoup, there is no link to this file. How can I find it and download it?
Solution
The export button executes a JavaScript function that reads the data table and creates the Excel file on the fly. BeautifulSoup cannot execute JavaScript, so the file never appears as a link in the HTML it parses.
But you can use the site's Ajax API to download the data in JSON format, parse it, and save it to CSV, for example:
import requests
import pandas as pd

# JSON endpoint of the Ajax API
api_url = "https://api.nhle.com/stats/rest/en/team/summary"

params = {
    "isAggregate": "true",
    "isGame": "true",
    "sort": '[{"property":"points","direction":"DESC"},{"property":"wins","direction":"DESC"},{"property":"franchiseId","direction":"ASC"}]',
    "start": "0",
    "limit": "50",
    "factCayenneExp": "gamesPlayed>=1",
    "cayenneExp": 'gameDate<="2022-10-23 23:59:59" and gameDate>="2022-10-23" and gameTypeId=2',
}

# Fetch the JSON and stop on an HTTP error instead of parsing an error page
response = requests.get(api_url, params=params, timeout=30)
response.raise_for_status()
data = response.json()

# The rows are stored under the "data" key
df = pd.DataFrame(data["data"])
print(df.to_markdown(index=False))  # to_markdown requires the tabulate package
df.to_csv("data.csv", index=False)
Prints:
faceoffWinPct | franchiseId | franchiseName | gamesPlayed | goalsAgainst | goalsAgainstPerGame | goalsFor | goalsForPerGame | losses | otLosses | penaltyKillNetPct | penaltyKillPct | pointPct | points | powerPlayNetPct | powerPlayPct | regulationAndOtWins | shotsAgainstPerGame | shotsForPerGame | ties | wins | winsInRegulation | winsInShootout |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.694915 | 11 | Chicago Blackhawks | 1 | 4 | 4 | 5 | 5 | 0 | 0 | 1.25 | 1 | 1 | 2 | 0.666666 | 0.666666 | 1 | 34 | 27 | 1 | 1 | 0 | |
0.491803 | 12 | Detroit Red Wings | 1 | 1 | 1 | 5 | 5 | 0 | 0 | 1 | 1 | 1 | 2 | 0.6 | 0.6 | 1 | 33 | 41 | 1 | 1 | 0 | |
0.432835 | 29 | San Jose Sharks | 1 | 0 | 0 | 3 | 3 | 0 | 0 | 1 | 1 | 1 | 2 | 0 | 0 | 1 | 30 | 25 | 1 | 1 | 0 | |
0.509803 | 33 | Florida Panthers | 1 | 2 | 2 | 3 | 3 | 0 | 0 | 0.666666 | 0.666667 | 1 | 2 | 0 | 0 | 1 | 26 | 32 | 1 | 1 | 0 | |
0.333333 | 36 | Columbus Blue Jackets | 1 | 1 | 1 | 5 | 5 | 0 | 0 | 0.666666 | 0.666667 | 1 | 2 | 0 | 0 | 1 | 31 | 21 | 1 | 1 | 0 | |
0.666666 | 10 | New York Rangers | 1 | 5 | 5 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0.333333 | 0.333333 | 0 | 21 | 31 | 0 | 0 | 0 | |
0.567164 | 16 | Philadelphia Flyers | 1 | 3 | 3 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 25 | 30 | 0 | 0 | 0 | |
0.490196 | 22 | New York Islanders | 1 | 3 | 3 | 2 | 2 | 1 | 0 | 1 | 1 | 0 | 0 | 0.333333 | 0.333333 | 0 | 32 | 26 | 0 | 0 | 0 | |
0.508196 | 32 | Anaheim Ducks | 1 | 5 | 5 | 1 | 1 | 1 | 0 | 0.4 | 0.4 | 0 | 0 | 0 | 0 | 0 | 41 | 33 | 0 | 0 | 0 | |
0.305084 | 39 | Seattle Kraken | 1 | 5 | 5 | 4 | 4 | 1 | 0 | 0.333333 | 0.333334 | 0 | 0 | -0.25 | 0 | 0 | 27 | 34 | 0 | 0 | 0 |
and saves the same data to data.csv, which can be opened in a spreadsheet application such as LibreOffice.
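To fetch a different game day, the hand-written query strings can be generated instead of edited by hand. The sketch below is only an illustration: `build_params` is a hypothetical helper name, not part of any library, and it assumes the API keeps accepting the same parameters as in the request above. It builds the `sort` value with `json.dumps` and fills the date into `cayenneExp`:

```python
import json


def build_params(day, limit=50):
    """Build the query parameters for one game day (day is 'YYYY-MM-DD').

    Hypothetical helper: it only reproduces the parameters used in the
    request above and has not been checked against other endpoints.
    """
    # Sort order as a plain Python structure; json.dumps with compact
    # separators yields the same JSON string as the hand-written one.
    sort = [
        {"property": "points", "direction": "DESC"},
        {"property": "wins", "direction": "DESC"},
        {"property": "franchiseId", "direction": "ASC"},
    ]
    return {
        "isAggregate": "true",
        "isGame": "true",
        "sort": json.dumps(sort, separators=(",", ":")),
        "start": "0",
        "limit": str(limit),
        "factCayenneExp": "gamesPlayed>=1",
        "cayenneExp": (
            f'gameDate<="{day} 23:59:59" and gameDate>="{day}" '
            "and gameTypeId=2"
        ),
    }


params = build_params("2022-10-23")
print(params["cayenneExp"])
```

The resulting dictionary can be passed as `params=` to `requests.get` exactly like the literal one above. Generating the strings keeps the quoting in one place, which is where typos are most likely when the date is changed manually.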
Answered By - Andrej Kesely