Issue
I am trying to run a loop in a web scraping script that uses Beautiful Soup to extract data from this Page. The loop goes through each div tag and extracts 4 different pieces of information, searching an h3, a div, and 2 span tags. But when I add ".text" I get errors for 'date', 'soldprice', and 'shippingprice'. The error says:
AttributeError: 'NoneType' object has no attribute 'text'
I can get the text value from 'title', but nothing else when I put ".text" at the end of the line or in the print function. Overall, the script extracts the correct information when it is run, but I don't want the HTML tags.
results = soup.find_all("div", {"class": "s-item__info clearfix"}) #to separate the section of text for each item on the page
for item in results:
    product = {
        'title': item.find("h3", attrs={"class": "s-item__title s-item__title--has-tags"}).text,
        'date': item.find("div", attrs={"class": "s-item__title--tag"}), #.find("span", attrs={"class": "POSITIVE"}),
        'soldprice': item.find("span", attrs={"class": "s-item__price"}),
        'shippingprice': item.find("span", attrs={"class": "s-item__shipping s-item__logisticsCost"}),
    }
    print(product)
Solution
The problem is that, before the offers, there is another div with class="s-item__info clearfix" that has no date, soldprice, or shippingprice, so find() returns None for those fields and ".text" fails on it. You have to add a find to search only within the offers:
results = soup.find('div', class_='srp-river-results clearfix').find_all("div", {"class": "s-item__info clearfix"})
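As a rough sketch of how this might fit together, the example below restricts the search to the offers container and also guards against a missing tag before reading its text. The HTML snippet, the text_or_none helper, and the parse_items function are hypothetical stand-ins made up for illustration; the real page's markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down markup standing in for the real page:
# one placeholder block outside the results container, one real offer inside it.
HTML = """
<div class="s-item__info clearfix">
  <h3 class="s-item__title s-item__title--has-tags">Placeholder item</h3>
</div>
<div class="srp-river-results clearfix">
  <div class="s-item__info clearfix">
    <h3 class="s-item__title s-item__title--has-tags">Example item</h3>
    <div class="s-item__title--tag">Sold Jan 1, 2021</div>
    <span class="s-item__price">$10.00</span>
    <span class="s-item__shipping s-item__logisticsCost">+$3.00 shipping</span>
  </div>
</div>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if find() found nothing."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_items(html):
    """Collect one dict per offer, searching only inside the results container."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_="srp-river-results clearfix")
    if container is None:
        return []  # container missing: nothing to parse

    products = []
    for item in container.find_all("div", {"class": "s-item__info clearfix"}):
        products.append({
            "title": text_or_none(item.find("h3", attrs={"class": "s-item__title s-item__title--has-tags"})),
            "date": text_or_none(item.find("div", attrs={"class": "s-item__title--tag"})),
            "soldprice": text_or_none(item.find("span", attrs={"class": "s-item__price"})),
            "shippingprice": text_or_none(item.find("span", attrs={"class": "s-item__shipping s-item__logisticsCost"})),
        })
    return products


for product in parse_items(HTML):
    print(product)
```

With the guard in place, an item that lacks one of the tags yields None for that field instead of raising AttributeError, which can make it easier to spot which blocks on the page are incomplete.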
Answered By - furas