Issue
I have the following HTML:
<h2 class="m0 t-regular">
<a data-js-aid="jobID" data-js-link="" href="/en/qatar/jobs/executive-chef-4276199/" data-job-id="4276199">
Executive Chef </a>
</h2>
How do I find the "a" tag?
So far my code returns an empty result:
import time
import requests
from bs4 import BeautifulSoup
soup = BeautifulSoup(
    requests.get("https://www.bayt.com/en/international/jobs/executive-chef-jobs/").content,
    "lxml"
)
follow_links = [
    a["href"] for a in
    soup.find_all("h2", class_="m0 t-regular")
    if "#" not in a["href"]
]
print(follow_links)
Result:
[]
How can I return the link?
Solution
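You are close. In the HTML you posted, the href attribute is on the "a" tag, not on the "h2" that your find_all() call matches. Select the "a" inside each "h2" and then use ['href'] to get the URL.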
Example
import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(
    requests.get("https://www.bayt.com/en/international/jobs/executive-chef-jobs/").content,
    "lxml"
)

links = []
# Select the <a> inside each <h2 class="m0 t-regular">, not the <h2> itself
for a in soup.select("h2.m0.t-regular a"):
    # Skip duplicates while keeping the original order
    if a['href'] not in links:
        links.append(a['href'])

print(links)
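The selector can also be checked without hitting the site, by running it against the HTML snippet from the question. The following is only a sketch under a few assumptions: the helper name extract_links is hypothetical, the built-in "html.parser" is used instead of "lxml" so that no extra parser is required, and the relative href is resolved against the page URL from the question with urljoin, which the original answer does not do.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Sample markup copied from the question; no network request is made.
HTML = """
<h2 class="m0 t-regular">
<a data-js-aid="jobID" data-js-link="" href="/en/qatar/jobs/executive-chef-4276199/" data-job-id="4276199">
Executive Chef </a>
</h2>
"""

# Page the relative links are resolved against (the URL used in the question).
BASE_URL = "https://www.bayt.com/en/international/jobs/executive-chef-jobs/"


def extract_links(html, base_url):
    """Hypothetical helper: unique absolute links inside h2.m0.t-regular headings."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # a[href] skips anchors that have no href attribute at all.
    for a in soup.select("h2.m0.t-regular a[href]"):
        url = urljoin(base_url, a["href"])
        # Keep the first occurrence only, preserving page order.
        if url not in links:
            links.append(url)
    return links


links = extract_links(HTML, BASE_URL)
print(links)
```

If this prints the job link but the live request still gives an empty list, the markup returned by the server probably differs from the snippet, and inspecting the response content is a reasonable next step.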
Answered By - HedgeHog