Issue
I'm trying to get the wins from the 'Overall match stats' on this page: https://www.fctables.com/teams/sunderland-194998/?template_id=11. Everything I try just returns 'None'. This isn't the only page I have tried, and every one seems to return 'None'. I'm not very advanced in this, so any help would be appreciated.
from bs4 import BeautifulSoup
import requests
URL = "https://www.fctables.com/teams/sunderland-194998/"
response = requests.get(URL)
soup = BeautifulSoup(response.text, "html.parser")
wins = soup.find('div', class_='text-success ')
print(wins)
I need it to output the '6' which is the number of wins. Preferably as an integer.
Solution
BeautifulSoup famously is the package that lets you parse other people's HTML garbage as if it were grammatically correct. The HTML grammar is, alas, a bit complicated.
You got hung up on the trailing SPACE in the class name. Just strip it.
>>> from pprint import pp
>>>
>>> pp(soup.find_all('div', class_='text-success '))
[]
>>> pp(soup.find_all('div', class_='text-success'))
[<div class="text-success">11</div>,
 <div class="text-success">1.83</div>,
 <div class="text-success">4</div>,
 <div class="text-success">4/6</div>,
 <div class="text-success">5/6</div>,
 <div class="text-success">2/6</div>,
 <div class="text-success">2/6</div>,
 <div class="text-success">41</div>,
 <div class="text-success">2.16</div>,
 <div class="text-success">11</div>,
 <div class="text-success">78.9%</div>,
 <div class="text-success">89.5%</div>,
 <div class="text-success">21.05%</div>,
 <div class="text-success">68.42%</div>]
Steve Harvey wants to know, "Could SPACE ever be part of a valid class name?" Survey says "nope!" The HTML class attribute is a space-separated list of names, so a SPACE can only separate class names; it can never be part of one.
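To get the value as an integer, something along these lines might work. This is only a sketch: SAMPLE_HTML is made-up markup standing in for the real page (the live HTML may be laid out differently), and first_int is a hypothetical helper name, not part of BeautifulSoup. It strips the class name before searching, then converts the first whole-number match with int().

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the live HTML may differ.
SAMPLE_HTML = """
<div class="stats">
  <div class="text-success">6</div>
  <div class="text-danger">3</div>
  <div class="text-success">1.83</div>
</div>
"""


def first_int(html, css_class):
    """Return the first <div> with css_class whose text is a whole number, else None."""
    soup = BeautifulSoup(html, "html.parser")
    # strip() drops stray whitespace such as the trailing SPACE in 'text-success '.
    for div in soup.find_all("div", class_=css_class.strip()):
        text = div.get_text(strip=True)
        if text.isdigit():  # skips values like '1.83', '4/6' or '78.9%'
            return int(text)
    return None


wins = first_int(SAMPLE_HTML, "text-success ")
print(wins)  # prints 6 for the sample markup above
```

On the real page several divs share the text-success class, so the first whole number is not necessarily the wins figure. You would probably want to narrow the search to the 'Overall match stats' block first and pick the value from there.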
Answered By - J_H