Issue
I am trying to scrape the text of the span tags, but I get the error "ResultSet object has no attribute 'find_all'". I think I need a second for-loop inside the last one, but I can't wrap my head around how to do this.
from bs4 import BeautifulSoup
import requests
urls = []
soups = []
divs = []
for i in range(20):
    i=i+1
    url = "https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page=" + str(i)
    urls.append(url)
for url in urls:
    page = requests.get(url)
    soups.append(BeautifulSoup(page.content, "html.parser"))
for soup in range(len(soups)):
    divs.append(soups[soup].find_all("div", class_="VehicleDetailTable_container__mUUbY"))
for div in range(len(divs)):
    mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
    year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
    print(mileage)
    print(year)
    print()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/tmp/ipykernel_17/4044669827.py in <module>
21
22 for div in range(len(divs)):
---> 23 mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
24 year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
25 print(mileage)
/opt/conda/lib/python3.7/site-packages/bs4/element.py in __getattr__(self, key)
2288 """Raise a helpful exception to explain a common code fix."""
2289 raise AttributeError(
-> 2290 "ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
2291 )
AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
Solution
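Try to avoid all these lists and loops; a single loop over the pages is enough and also eliminates the error. The error occurs because each element of divs is itself a ResultSet (the list that find_all() returned for one page), not a single tag, so it has no find_all() method. Otherwise you would have to iterate over each ResultSet in an additional loop, which works but is not good practice here.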
for i in range(1,21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")
    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        mileage = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
        year = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
        print(mileage)
        print(year)
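If you do want to keep the list of ResultSets from the question, the additional iteration would look roughly like the sketch below. It runs offline on two made-up HTML snippets; the markup and the class names "box" and "item" are assumptions for illustration, not the real autoscout24 page structure.

```python
from bs4 import BeautifulSoup

# Two tiny stand-in "pages"; markup and class names are invented for this sketch.
PAGES = [
    '<div class="box"><span class="item">10,000 km</span><span class="item">2019</span></div>',
    '<div class="box"><span class="item">25,000 km</span><span class="item">2021</span></div>'
    '<div class="box"><span class="item">40,000 km</span><span class="item">2017</span></div>',
]

# One ResultSet per page, as in the question: effectively a list of lists of tags.
divs = [BeautifulSoup(html, "html.parser").find_all("div", class_="box") for html in PAGES]

rows = []
for result_set in divs:      # outer loop: one ResultSet per page
    for div in result_set:   # inner loop: one tag per listing
        spans = div.find_all("span", class_="item")
        rows.append((spans[0].text, spans[1].text))

print(rows)
```

The inner loop is what the question's last loop was missing: find_all() exists on each tag inside a ResultSet, not on the ResultSet itself.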
Be aware: working with these indexes can yield texts that do not match your expectations when a listing has no mileage or year, because the remaining spans then shift position. It would be better to scrape the detail pages instead.
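One possible way to soften the index problem, short of scraping the detail pages, is to pick the values by their text pattern instead of their position. This is only a sketch: the helper name classify and both patterns are hypothetical, and the assumed text shapes (mileage like "45.000 km", first registration like "03/2019") are not confirmed by the question.

```python
import re

# Assumed text shapes (not confirmed by the source):
# mileage such as "45.000 km", first registration such as "03/2019" or "2019".
MILEAGE_RE = re.compile(r"^\s*[\d.,]+\s*km\s*$", re.IGNORECASE)
YEAR_RE = re.compile(r"^\s*(?:\d{2}/)?\d{4}\s*$")


def classify(texts):
    """Return (mileage, year) chosen by pattern; None where nothing matches."""
    mileage = next((t.strip() for t in texts if MILEAGE_RE.match(t)), None)
    year = next((t.strip() for t in texts if YEAR_RE.match(t)), None)
    return mileage, year


print(classify(["45.000 km", "03/2019", "Benzin"]))
print(classify(["03/2019", "Benzin"]))
```

Inside the loop you could build texts as a list of span.text values for each div and pass it to classify. Placeholder texts that fit neither pattern simply come back as None rather than being mislabeled.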
Example
from bs4 import BeautifulSoup
import requests
for i in range(1,21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")
    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        mileage = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
        year = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
        print(mileage)
        print(year)
Answered By - HedgeHog