Issue
I want to extract ads from a website that contain either of two Persian words, "توافق" or "توافقی". I am using BeautifulSoup and splitting the content of each result to find the ads that contain these words, but my code does not work. Could you please help me? Here is my simple code:
import requests
from bs4 import BeautifulSoup

r = requests.get("https://divar.ir/s/tehran")
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", attrs={"class": "kt-post-card__body"})
for content in results:
    words = content.split()
    if words == "توافقی" or words == "توافق":
        print(content)
Solution
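Since توافقی appears in the div tags with the kt-post-card__description class, I will search for those instead. Note that each result is a Tag rather than a string, so take its .text first, and that comparing a list of words to a string with == is never true; use in instead. Once you have a matching tag, you can reach the rest of the ad through tag properties such as .previous_sibling or .parent.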
import requests
from bs4 import BeautifulSoup

r = requests.get("https://divar.ir/s/tehran")
soup = BeautifulSoup(r.text, "html.parser")
results = soup.find_all("div", attrs={"class": "kt-post-card__description"})
for content in results:
    text = content.text
    if "توافقی" in text or "توافق" in text:
        print(content.previous_sibling)  # It's the h2 title.
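If you want to try the idea without hitting the site, here is a minimal offline sketch. The markup in SAMPLE_HTML is made up to imitate the structure described above (the real page may differ, or may be rendered by JavaScript), and find_matching_titles and KEYWORDS are hypothetical names used only for illustration. It is a variation, not the exact code above: it matches whole words, as the question tried to do, and it reaches the title through the parent card instead of .previous_sibling, because .previous_sibling can return a whitespace text node when there are line breaks between the tags.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the card structure discussed above.
SAMPLE_HTML = """
<div class="kt-post-card__body">
  <h2 class="kt-post-card__title">Ad A</h2>
  <div class="kt-post-card__description">توافقی</div>
</div>
<div class="kt-post-card__body">
  <h2 class="kt-post-card__title">Ad B</h2>
  <div class="kt-post-card__description">۵۰۰٬۰۰۰ تومان</div>
</div>
<div class="kt-post-card__body">
  <h2 class="kt-post-card__title">Ad C</h2>
  <div class="kt-post-card__description">قیمت توافق</div>
</div>
"""

# The two words the question is looking for.
KEYWORDS = ("توافق", "توافقی")


def find_matching_titles(html):
    """Return the h2 titles of cards whose description contains a keyword."""
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for desc in soup.find_all("div", class_="kt-post-card__description"):
        # get_text() gives a plain string, which can be split into words.
        words = desc.get_text().split()
        if any(word in KEYWORDS for word in words):
            # Walk up to the enclosing card, then look for its h2 title.
            card = desc.find_parent("div", class_="kt-post-card__body")
            title = card.find("h2") if card is not None else None
            if title is not None:
                titles.append(title.get_text(strip=True))
    return titles


titles = find_matching_titles(SAMPLE_HTML)
print(titles)
```

To run it against the live page, pass r.text from the request above instead of SAMPLE_HTML, assuming the class names still match what the site serves.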
Answered By - S.B