Issue
I have a number of text items contained in span tags that I need to extract. I am able to do this with a list comprehension for the cells with the class table-main__odds, as shown. I need to get the same info from the cells with the class table-main__odds colored. The logic below does not return any values for those cells. Any help is appreciated.
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://www.betexplorer.com/soccer/england/premier-league/results/'
soup = BeautifulSoup(requests.get(url).content)
odds_raw = soup.find_all("td", class_="table-main__odds")
fav_odds_raw = soup.find_all("td",class_="table-main__odds colored")
odds = [o.get('data-odd') for o in odds_raw]
The desired result is a list of the values contained in the data-odd attribute.
Solution
Try:
import requests
import pandas as pd
from bs4 import BeautifulSoup
url = "https://www.betexplorer.com/soccer/england/premier-league/results/"
soup = BeautifulSoup(requests.get(url).content, "html.parser")
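def get_odd_or_text(td):
    # The odd is stored on the cell itself...
    if "data-odd" in td.attrs:
        return td["data-odd"]

    # ...or on an element nested inside the cell.
    odd = td.select_one("[data-odd]")
    if odd:
        return odd["data-odd"]

    # Cells without odds (match, score, date): return the plain text.
    return td.get_text(strip=True)

all_data = []
# Only rows that contain <td> cells, i.e. skip the header rows.
for row in soup.select(".table-main tr:has(td)"):
    tds = [get_odd_or_text(td) for td in row.select("td")]
    # The round name is in the first <th> of the closest preceding header row.
    round_ = row.find_previous("th").find_previous("tr").th.text
    all_data.append([round_, *tds])

df = pd.DataFrame(
    all_data, columns=["Round", "Match", "Score", "1", "X", "2", "Date"]
)
# to_markdown() requires the optional tabulate package.
print(df.head().to_markdown(index=False))
df.to_csv('data.csv', index=False)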
Prints:
| Round | Match | Score | 1 | X | 2 | Date |
|---|---|---|---|---|---|---|
| 14. Round | Arsenal-Nottingham | 5:0 | 1.22 | 6.75 | 13.19 | 30.10. |
| 14. Round | Manchester Utd-West Ham | 1:0 | 1.71 | 3.87 | 4.97 | 30.10. |
| 14. Round | Bournemouth-Tottenham | 2:3 | 4.97 | 3.72 | 1.74 | 29.10. |
| 14. Round | Brentford-Wolves | 1:1 | 2.17 | 3.43 | 3.41 | 29.10. |
| 14. Round | Brighton-Chelsea | 4:1 | 3.07 | 3.35 | 2.38 | 29.10. |
and saves data.csv
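To see why the fallback in get_odd_or_text matters, here is a small offline sketch. The HTML fragment is a simplified assumption, not copied from the real page: it supposes that the colored cell keeps data-odd on a nested span while the other cells carry it on the td itself, which is what the second branch of the helper appears to handle.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup (an assumption, not the real page source).
SAMPLE_HTML = """
<table class="table-main">
  <tr>
    <td class="table-main__odds colored"><span><span data-odd="1.22"></span></span></td>
    <td class="table-main__odds" data-odd="6.75"></td>
    <td class="table-main__odds" data-odd="13.19"></td>
  </tr>
</table>
"""


def get_odd_or_text(td):
    # Same helper as in the answer above.
    if "data-odd" in td.attrs:
        return td["data-odd"]
    odd = td.select_one("[data-odd]")
    if odd:
        return odd["data-odd"]
    return td.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The CSS class selector matches every cell that has this class,
# including cells that also have the "colored" class.
cells = soup.select("td.table-main__odds")

# Reading the attribute from the td alone misses the nested value.
direct = [td.get("data-odd") for td in cells]

# The helper also looks inside the cell.
with_fallback = [get_odd_or_text(td) for td in cells]

# Only the colored (favourite) cells.
fav = [get_odd_or_text(td) for td in soup.select("td.table-main__odds.colored")]

print(direct)         # [None, '6.75', '13.19']
print(with_fallback)  # ['1.22', '6.75', '13.19']
print(fav)            # ['1.22']
```

If the real page is structured differently, only SAMPLE_HTML needs to change; the helper and the selectors stay the same.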
Answered By - Andrej Kesely