Issue
I am trying to grab the rankings history link from a URL using the following scraping code:
import requests
from bs4 import BeautifulSoup
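url = 'https://247sports.com/Player/Trevor-Lawrence-61350/college-212444/'
# headers was not defined in the original snippet; a browser-like User-Agent is assumed here
headers = {'User-Agent': 'Mozilla/5.0'}
pageTree = requests.get(url, headers=headers)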
Soup = BeautifulSoup(pageTree.content, 'html.parser')
past_link = Soup.find_all('ul', {'class':'ranks-list'})
past_link
I was able to generate this output:
[<ul class="ranks-list">
<li>
<b>Natl.</b>
<a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool">
<strong>1</strong>
</a>
<a class="rank-history-link" href="https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/">
History
</a>
</li>
<li>
<b>PRO</b>
<a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool&Position=PRO">
<strong>1</strong>
</a>
</li>
<li>
<b>GA</b>
<a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool&State=GA">
<strong>1</strong>
</a>
</li>
<li>
<b>All-Time</b>
<a href="https://247sports.com/Sport/Football/AllTimeRecruitRankings/">
<strong>6</strong>
</a>
</li>
</ul>]
But going any further with something like past_link.find_all('a') raises an error. What should the next step be from here? Any assistance is truly appreciated. Thanks in advance.
Solution
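The error occurs because find_all returns a ResultSet (a list of tags), and a ResultSet has no find_all method of its own; the search has to run on a single tag or on the soup itself. To get the rankings history link from that page, you can use the following example: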
import requests
from bs4 import BeautifulSoup
url = "https://247sports.com/Player/Trevor-Lawrence-61350/college-212444/"
headers = {
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:103.0) Gecko/20100101 Firefox/103.0"
}
soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")
history_link = soup.select_one(".rank-history-link")["href"]
print(history_link)
Prints:
https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/
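If you would rather keep the find_all approach from the question, one option is to loop over the returned tags and search inside each one. The sketch below is only an illustration: it parses a trimmed copy of the markup shown in the question instead of fetching the live page, and the names html, rank_lists and history_links are made up for this example.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question, so no network access is needed.
html = """
<ul class="ranks-list">
<li>
<b>Natl.</b>
<a href="https://247sports.com/Season/2018-Football/CompositeRecruitRankings/?InstitutionGroup=HighSchool">
<strong>1</strong>
</a>
<a class="rank-history-link" href="https://247sports.com/PlayerSport/Trevor-Lawrence-at-Cartersville-116605/RecruitRankHistory/">
History
</a>
</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of tags), not a single tag.
rank_lists = soup.find_all("ul", {"class": "ranks-list"})

# Search inside each tag individually instead of calling find_all() on the list.
history_links = [
    a["href"]
    for ul in rank_lists
    for a in ul.find_all("a", class_="rank-history-link")
]
print(history_links)
```

On the live page, select_one(".rank-history-link") returns None when no such link exists, so it may be worth checking for None before indexing with ["href"].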
Answered By - Andrej Kesely