Issue
I want to scrape the players table from this page for my own personal use: https://fbref.com/en/comps/9/stats/Premier-League-Stats
However, no matter how I try to navigate the parse tree, I can never seem to access the actual table statistics part of the html for the players.
In the page's HTML, the element that holds the table has the attribute id="div_stats_standard". When I search for that id in the soup from my Jupyter Notebook with this code:
import requests
from bs4 import BeautifulSoup
url = "https://fbref.com/en/comps/9/stats/Premier-League-Stats"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
table = soup.find_all(id="div_stats_standard")
print(table)
I get the output:
[]
Even stranger, when I scroll through the printed soup to the place where that element sits in the web page's HTML, it isn't there. Can anyone help me with this, please?
Solution
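The table is present in the HTML that requests downloads, but it is wrapped in an HTML comment (<!-- ... -->). BeautifulSoup therefore sees it as comment text rather than as tags, which is why find_all returns an empty list. Stripping the comment markers from the response text before parsing exposes it. This is how you can obtain the tables on that page (I imagine the one you're looking for is the last dataframe):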
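from io import StringIO
import pandas as pd
import requests
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36"
}
url = 'https://fbref.com/en/comps/9/stats/Premier-League-Stats'
# Send the browser-like User-Agent, then strip the comment markers so the commented-out tables become ordinary markup
response = requests.get(url, headers=headers).text.replace('<!--', '').replace('-->', '')
# read_html returns one DataFrame per <table>; StringIO avoids the literal-string deprecation warning in recent pandas
dfs = pd.read_html(StringIO(response))
for df in dfs:
    print(df)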
This will return the available tables on the page:
Squad # Pl Age Poss MP Starts Min 90s Gls Ast G-PK PK PKatt CrdY CrdR Gls Ast G+A G-PK G+A-PK xG npxG xA npxG+xA xG xA xG+xA npxG npxG+xA
0 Arsenal 16 24.6 51.0 3 33 270 3.0 8 6 8 0 0 4 0 2.67 2.00 4.67 2.67 4.67 5.4 5.4 3.1 8.5 1.80 1.04 2.83 1.80 2.83
1 Aston Villa 18 26.8 57.7 3 33 270 3.0 3 3 3 0 0 8 0 1.00 1.00 2.00 1.00 2.00 3.5 3.5 2.9 6.4 1.17 0.97 2.14 1.17 2.14
2 Bournemouth 18 26.2 36.3 3 33 270 3.0 2 1 2 0 0 8 0 0.67 0.33 1.00 0.67 1.00 0.9 0.9 0.5 1.5 0.30 0.18 0.48 0.30 0.48
3 Brentford 19 26.2 44.7 3 33 270 3.0 8 6 8 0 0 2 0 2.67 2.00 4.67 2.67 4.67 3.7 3.7 3.0 6.8 1.24 1.01 2.25 1.24 2.25
4 Brighton 17 28.0 47.7 3 33 270 3.0 4 2 3 1 1 3 0 1.33 0.67 2.00 1.00 1.67 2.6 2.6 1.9 4.5 0.86 0.64 1.50 0.86 1.50
5 Chelsea 17 28.1 62.3 3 33 270 3.0 3 2 2 1 1 8 1 1.00 0.67 1.67 0.67 1.33 3.1 2.3 1.9 4.3 1.02 0.64 1.66 0.78 1.42
[...]
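If you would rather stay with BeautifulSoup and target the one element from the question, a possible alternative is to search inside the comment nodes directly instead of stripping every comment marker. The sketch below is only an illustration: find_in_comments is a hypothetical helper, and sample_html is a made-up, heavily simplified stand-in for the real page, assuming the table sits inside a comment as described above.

```python
from bs4 import BeautifulSoup, Comment


def find_in_comments(html, element_id):
    """Return the first element with the given id, also looking inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    found = soup.find(id=element_id)
    if found is not None:
        return found
    # Comment nodes are plain strings to the parser, so re-parse each one as HTML
    for comment in soup.find_all(string=lambda text: isinstance(text, Comment)):
        inner = BeautifulSoup(comment, "html.parser")
        found = inner.find(id=element_id)
        if found is not None:
            return found
    return None


# Hypothetical sample markup, not copied from the real page
sample_html = """
<div id="all_stats_standard">
<!--
<div id="div_stats_standard">
<table>
<tr><th>Player</th><th>Gls</th></tr>
<tr><td>Example Player</td><td>3</td></tr>
</table>
</div>
-->
</div>
"""

table_div = find_in_comments(sample_html, "div_stats_standard")
rows = [
    [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
    for tr in table_div.find_all("tr")
]
print(rows)
```

With the real page you would pass the text of the requests response instead of sample_html, and the element found this way could then be handed to pd.read_html via str(table_div) wrapped in StringIO.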
Answered By - platipus_on_fire