Issue
import requests
from bs4 import BeautifulSoup
from datetime import datetime
from dateutil.relativedelta import relativedelta
evr_begin = datetime.now().strftime("%m/%d/%Y")
evr_end = (datetime.now() + relativedelta(months=1)).strftime("%m/%d/%Y")
url = "https://mms.kcbs.us/members/evr_search_ol_json.php?" \
      f"otype=TEXT&evr_map_type=2&org_id=KCBA&evr_begin={evr_begin}&evr_end=.{evr_end}&" \
      "evr_radius=50&evr_type=269&evr_region_type=1"
response = requests.request("GET", url)
soup = BeautifulSoup(response.text, features='lxml')
for event in soup.find_all('div', class_='row'):
    print(event.find('b').getText())
    print(event.find('i').getText())
Link to website: https://mms.kcbs.us/members/evr_search.php?org_id=KCBA
I'm unsure how to print what comes after the information I'm already printing. Part of the issue is that some of the other text shares the same tag, while for the rest I'm just not sure how to select it.
For example, for the first event I need to print
Frisco, CO 80443 UNITED STATES STATE CHAMPIONSHIP Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER Prize Money: $13,050.00
all separately.
If I use
print(event.find('div', class_='col-md-4').getText())
within the for loop, it prints everything clumped together.
Solution
What I would do is create a dictionary that maps the position of each piece of data within a row (column index first, then position within that column) to a field name. Then collect each row into its own dictionary and append it to a list that you can process once parsing has finished.
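The position-to-name idea can be tried in isolation, without any network access. The following is only a minimal sketch: the strings are hard-coded from the sample event shown in this post, and `FIELD_NAMES` and `label_columns` are hypothetical names introduced for illustration. It also assumes that ignoring unexpected extra strings is acceptable, which may not suit every use.

```python
# Hypothetical field names: one inner list per column, in display order.
FIELD_NAMES = [
    ["title", "dates", "city/state", "country"],
    ["event", "reps", "prize"],
    ["results"],
]


def label_columns(columns, field_names=FIELD_NAMES):
    """Map each string to a field name based on its (column, position) index."""
    event = {}
    for names, strings in zip(field_names, columns):
        # Drop whitespace-only strings so positions stay aligned with the names.
        cleaned = [s.strip() for s in strings if s.strip()]
        # zip stops at the shorter sequence, so extra strings are ignored
        # instead of raising an error.
        event.update(zip(names, cleaned))
    return event


# Strings hard-coded from the sample event in this post (an assumption about
# how the page splits them into columns).
columns = [
    ["Frisco BBQ Challenge", "6/16/2022 - 6/18/2022",
     "Frisco, CO 80443", "UNITED STATES"],
    ["STATE CHAMPIONSHIP",
     "Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER",
     "Prize Money: $13,050.00"],
    ["Results Not In"],
]

print(label_columns(columns))
```

With real pages, the inner lists would come from the strings of each column element rather than from hard-coded values.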
For example:
import requests
from bs4 import BeautifulSoup
from datetime import datetime
from dateutil.relativedelta import relativedelta
import json
# Field names keyed by column index, then by position within the column
data = {
    0: {0: "title", 1: "dates", 2: "city/state", 3: "country"},
    1: {0: "event", 1: "reps", 2: "prize"},
    2: {0: "results"}
}
evr_begin = datetime.now().strftime("%m/%d/%Y")
evr_end = (datetime.now() + relativedelta(months=1)).strftime("%m/%d/%Y")
url = f"https://mms.kcbs.us/members/evr_search_ol_json.php?otype=TEXT&evr_map_type=2&org_id=KCBA&evr_begin={evr_begin}&evr_end=.{evr_end}&evr_radius=50&evr_type=269&evr_region_type=1"
response = requests.request("GET", url)
print(response.content)
soup = BeautifulSoup(response.text, features='lxml')
all_data = []
for element in soup.find_all('div', class_="row"):
    event = {}
    for i, col in enumerate(element.find_all('div', class_='col-md-4')):
        for j, item in enumerate(col.strings):
            event[data[i][j]] = item
    all_data.append(event)
print(json.dumps(all_data, indent=4))
The output would look something like this:
{
    "title": "Frisco BBQ Challenge",
    "dates": "6/16/2022 - 6/18/2022",
    "city/state": "Frisco, CO 80443",
    "country": "UNITED STATES",
    "event": "STATE CHAMPIONSHIP",
    "reps": "Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER",
    "prize": "Prize Money: $13,050.00",
    "results": "Results Not In"
},
{
    "title": "York County BBQ Festival",
    "dates": "6/17/2022 - 6/18/2022",
    "city/state": "Delta, PA 17314",
    "country": "UNITED STATES",
    "event": "STATE CHAMPIONSHIP",
    "reps": "Reps: ANGELA MCKEE, ROBERT MCKEE, LOUISE WEIDNER",
    "prize": "Prize Money: $5,500.00",
    "results": "Results Not In"
},
...
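In the output above, the "reps" and "prize" values still carry their "Reps:" and "Prize Money:" labels. If separate names and a numeric amount are needed, something like the following might work as a post-processing step. It is a sketch only: `parse_reps` and `parse_prize` are hypothetical helpers, and it assumes every value follows the "Label: value" shape seen in the two sample events.

```python
from decimal import Decimal


def parse_reps(text):
    """Turn 'Reps: A, B, C' into ['A', 'B', 'C']."""
    # Everything after the first colon is the comma-separated name list.
    _, _, names = text.partition(":")
    return [name.strip() for name in names.split(",") if name.strip()]


def parse_prize(text):
    """Turn 'Prize Money: $13,050.00' into Decimal('13050.00')."""
    _, _, amount = text.partition(":")
    amount = amount.strip().lstrip("$").replace(",", "")
    # Return None when there is no amount instead of raising an error.
    return Decimal(amount) if amount else None


# Values copied from the first event in the sample output.
event = {
    "reps": "Reps: BUNNY TUTTLE, RICH TUTTLE, MICHAEL WINTER",
    "prize": "Prize Money: $13,050.00",
}

reps = parse_reps(event["reps"])
prize = parse_prize(event["prize"])
print(reps)
print(prize)
```

`Decimal` is used here rather than `float` so that the currency amount keeps its exact two decimal places.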
Answered By - alexpdev