Issue
I need to get the hrefs from the <a> tags on a website, but not all of them: only the ones inside the spans located in the <div>s with class arm.
<html>
<body>
<div class="arm">
<span>
<a href="1">link</a>
<a href="2">link</a>
<a href="3">link</a>
</span>
</div>
<div class="arm">
<span>
<a href="4">link</a>
<a href="5">link</a>
<a href="6">link</a>
</span>
</div>
<div class="arm">
<span>
<a href="7">link</a>
<a href="8">link</a>
<a href="9">link</a>
</span>
</div>
<div class="footnote">
<span>
<a href="1">anotherLink</a>
<a href="2">anotherLink</a>
<a href="3">anotherLink</a>
</span>
</div>
</body>
</html>
import requests
from bs4 import BeautifulSoup as bs

request = requests.get("url")
html = bs(request.content, 'html.parser')

for arm in html.select(".arm"):
    anchor = arm.select("span > a")
    print("anchor['href']")
But my code doesn't print the hrefs.
Solution
Your code looks fine until you get to the print("anchor['href']") line, which I assume is meant to be print(anchor['href']).
Now, anchor is a ResultSet (a list of the matching tags, not a single tag), which means you need another loop to get the hrefs. Here is how those final lines should look if you want minimal changes to your code:
for arm in html.select(".arm"):
    anchor = arm.select("span > a")
    for x in anchor:
        print(x.attrs['href'])
We basically add:
    for x in anchor:
        print(x.attrs['href'])
And you should get the hrefs. All the best.
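As a possible alternative, the two loops can be folded into a single CSS selector. The sketch below runs against a trimmed copy of the HTML from the question instead of a live request, so it works offline. The names SAMPLE_HTML and arm_hrefs are made up for illustration, and the [href] filter is an added assumption that anchors without an href should be skipped.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's HTML, used in place of requests.get(...)
SAMPLE_HTML = """
<html><body>
<div class="arm"><span>
<a href="1">link</a><a href="2">link</a><a href="3">link</a>
</span></div>
<div class="arm"><span>
<a href="4">link</a><a href="5">link</a><a href="6">link</a>
</span></div>
<div class="footnote"><span>
<a href="1">anotherLink</a>
</span></div>
</body></html>
"""


def arm_hrefs(markup):
    """Return the href of every <a> that is a direct child of a <span> inside an element with class "arm"."""
    soup = BeautifulSoup(markup, "html.parser")
    # select() returns a ResultSet (a list of tags), so iterate over it
    # rather than indexing it with a string key.
    return [a["href"] for a in soup.select(".arm span > a[href]")]


hrefs = arm_hrefs(SAMPLE_HTML)
print(hrefs)
```

The links inside the footnote div are left out because that div does not have the arm class.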
Answered By - was1209