Issue
I'm having trouble scraping content from the following page: https://pregame.com/game-center/171763/consensus-archive. I'm using Beautiful Soup and only getting back snippets of the HTML, without any of the data that I can clearly see embedded in the code.
This is the latest iteration of the code I've used after a handful of attempts (this was an attempt at grabbing only the dates column, but I want to grab the entire table):
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://pregame.com/game-center/171763/consensus-archive'
html = requests.get(url)
soup = BeautifulSoup(html.text, 'html.parser')
results = soup.find(class_ = "pg-move-list")
print(results.prettify())
dates = results.find_all("td", class_="pg-col pg-col--date")
for date in dates:
    print(date, end = "\n"*2)
for date in dates:
    data_date = date.find("p", class_= "pg-col-data")
    print(data_date.text)
The HTML as well...
I realize there are many similar questions here and elsewhere on the web, but I'm still stuck after referencing them. Thank you in advance for the help.
Solution
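The data is generated dynamically: the page loads it from an external source via an API call made by JavaScript. BeautifulSoup (bs4) only parses the HTML it is given and cannot execute JavaScript, which is why you are getting only the static portion of the HTML.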
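One way to confirm this is to count the date cells in whatever HTML you hand to BeautifulSoup. The sketch below is only an illustration: `count_date_cells` is a hypothetical helper, and the two HTML strings are made-up stand-ins (not the real page markup) for what `requests` receives versus what the browser shows after the scripts have run.

```python
from bs4 import BeautifulSoup


def count_date_cells(html: str) -> int:
    """Count <td> cells that carry the pg-col--date class in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all() returns an empty list when nothing matches,
    # so there is no None to guard against here.
    return len(soup.find_all("td", class_="pg-col--date"))


# Hypothetical stand-in for the static HTML returned by requests:
# the container exists, but no rows have been filled in yet.
static_html = '<div class="pg-move-list"><table><tbody></tbody></table></div>'

# Hypothetical stand-in for the DOM after the browser has run the page's scripts.
rendered_html = (
    '<div class="pg-move-list"><table><tbody><tr>'
    '<td class="pg-col pg-col--date"><p class="pg-col-data">10/17</p></td>'
    "</tr></tbody></table></div>"
)

print(count_date_cells(static_html))    # 0
print(count_date_cells(rendered_html))  # 1
```

If the same check on `html.text` from the question returns 0, the rows are not in the response at all, and no amount of selector tweaking will find them.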
Example:
import pandas as pd
import requests
api_url = 'https://pregame.com/api/gamecenter/consensushistory?e=171763&s=40&r=1000&a=1&c=1&t=693'
r = requests.get(api_url)
df = pd.DataFrame(r.json()['Items'])
print(df)
Output:
           Id                  DateTime   Odds  ...  IsPickActionChanged  PickAction  PickPercentage
0    60149470   2021-10-17T12:39:05.18Z    +10  ...                False         172              86
1    60147744  2021-10-17T12:16:32.793Z    +10  ...                False         169              86
2    60146757   2021-10-17T12:00:41.64Z    +10  ...                False         162              86
3    60146458  2021-10-17T11:55:49.823Z    +10  ...                False         162              86
4    60146333  2021-10-17T11:53:50.477Z    +10  ...                False         162              86
..        ...                       ...    ...  ...                  ...         ...             ...
130  59716689  2021-10-12T17:41:27.397Z    +10  ...                False          14              82
131  59716636   2021-10-12T17:40:44.01Z    +10  ...                False          14              82
132  59716531  2021-10-12T17:39:28.603Z    +10  ...                False          14              82
133  59715523  2021-10-12T17:24:22.067Z    +10  ...                False          13              81
134  59655757  2021-10-11T01:02:33.873Z  Other  ...                 True           1             100

[135 rows x 12 columns]
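As a possible follow-up, the query string can be passed as a `params` dict and the `DateTime` strings converted to real timestamps. This is a minimal sketch, not a definitive implementation: the helper names are hypothetical, and it assumes that `e` is the event ID from the page URL (171763). The other parameter values are copied verbatim from the URL above, and their meaning is not known here.

```python
import pandas as pd

API_URL = "https://pregame.com/api/gamecenter/consensushistory"


def build_params(event_id: int) -> dict:
    # Assumption: "e" is the event ID taken from the page URL.
    # The remaining values are copied as-is from the answer's URL.
    return {"e": event_id, "s": 40, "r": 1000, "a": 1, "c": 1, "t": 693}


def items_to_frame(payload: dict) -> pd.DataFrame:
    """Turn the decoded JSON payload into a DataFrame sorted oldest-first."""
    df = pd.DataFrame(payload["Items"])
    # The API returns ISO 8601 strings ending in "Z"; utc=True keeps them timezone-aware.
    # If some values lack fractional seconds, pandas 2.x may need format="ISO8601".
    df["DateTime"] = pd.to_datetime(df["DateTime"], utc=True)
    return df.sort_values("DateTime").reset_index(drop=True)


def fetch_history(event_id: int) -> pd.DataFrame:
    import requests  # imported here so the helpers above work without network access

    r = requests.get(API_URL, params=build_params(event_id), timeout=30)
    r.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
    return items_to_frame(r.json())


# Offline demonstration using two rows taken from the output above.
sample = {
    "Items": [
        {"Id": 60149470, "DateTime": "2021-10-17T12:39:05.18Z", "Odds": "+10"},
        {"Id": 59655757, "DateTime": "2021-10-11T01:02:33.873Z", "Odds": "Other"},
    ]
}
print(items_to_frame(sample))
```

Calling `fetch_history(171763)` would then request the same data as the example above, provided the endpoint still responds the same way.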
Answered By - F.Hoque