Issue
https://www.ejendomstorvet.dk/ledigelokaler/koebenhavn-by/detailhandel-butik
I tried to scrape this real estate listing page. The page reports 241 results, but after several attempts my script only ever returns 12.
from bs4 import BeautifulSoup
import requests
from csv import writer

url = "https://www.ejendomstorvet.dk/ledigelokaler/detailhandel-butik"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
lists = soup.find_all('div', class_="propcontainer")

with open('ejendomstorvet_butik_03-140322.csv', 'w', encoding='utf8', newline='') as f:
    thewriter = writer(f)
    header = ['title', 'address_01', 'address_02', 'size', 'price', 'link']
    thewriter.writerow(header)
    for e in lists:
        title = e.find('div', class_="prop__intro").text.replace('\r\n', '')
        address_01 = e.find('div', class_="prop__address").text.replace('\r\n', '')
        address_02 = e.find('div', class_="prop__address2").text.replace('\r\n', '')
        size = e.find('span', class_="prop__size").text.replace('\r\n', '')
        price = e.find('span', class_="prop__price").text.replace('\r\n', '')
        link = e.find('a', href=True)['href']  # write the URL, not the whole <a> tag
        info = [title, address_01, address_02, size, price, link]
        thewriter.writerow(info)
Solution
The full result list is populated dynamically by JavaScript from the JSON response of an API call, so requests only sees the first 12 listings in the static HTML. Here is an example that reads the listings from that API endpoint instead.
Script:
import requests
import json
import pandas as pd
cookies={'Cookie': 'ASP.NET_SessionId=unma1hbvnwdd53w0iqfh5mfl; search=id=3ee38b0e-551b-4190-a617-cf0dd020d99a&itemtype=OwnUse&url=/ledigelokaler/koebenhavn-by/detailhandel-butik&convertedsearch=0; usercookie=id=f3c5912a-aa72-4a78-bff0-826eaab28072&c=MDMvMTQvMjAyMiAxOTo1NDoxNw==&data=JGYzYzU5MTJhLWFhNzItNGE3OC1iZmYwLTgyNmVhYWIyODA3MgCAo5vl0gXaiAEBJGYzYzU5MTJhLWFhNzItNGE3OC1iZmYwLTgyNmVhYWIyODA3Mglhbm9ueW1vdXMA; settings=usersettings=AAEAAAAAAAZsYXRlc3QABmNsb3NlZAA=; prism_610987956=e5c78891-283e-43ad-a8d9-43db0a7dd452; _clck=1xe2hka|1|ezr|0; CookieInformationConsent=%7B%22website_uuid%22%3A%2281ea14e2-c192-405e-8523-46a926c64030%22%2C%22timestamp%22%3A%222022-03-14T15%3A55%3A15.512Z%22%2C%22consent_url%22%3A%22https%3A%2F%2Fwww.ejendomstorvet.dk%2Fledigelokaler%2Fkoebenhavn-by%2Fdetailhandel-butik%22%2C%22consent_website%22%3A%22Ejendomstorvet.dk%22%2C%22consent_domain%22%3A%22www.ejendomstorvet.dk%22%2C%22user_uid%22%3A%227af36c12-088f-441a-a76a-1a52df8f229a%22%2C%22consents_approved%22%3A%5B%22cookie_cat_necessary%22%2C%22cookie_cat_functional%22%2C%22cookie_cat_statistic%22%2C%22cookie_cat_marketing%22%2C%22cookie_cat_unclassified%22%5D%2C%22consents_denied%22%3A%5B%5D%2C%22user_agent%22%3A%22Mozilla%2F5.0%20%28Windows%20NT%2010.0%3B%20Win64%3B%20x64%29%20AppleWebKit%2F537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome%2F99.0.4844.51%20Safari%2F537.36%22%7D; _gcl_au=1.1.273465374.1647273316; _gid=GA1.2.924375637.1647273316; _ga_0Q5HR8S1F9=GS1.1.1647273305.1.1.1647273357.18; _ga=GA1.2.1641080384.1647273306; _uetsid=073761a0a3af11ecacdf8b5661088ce4; _uetvid=07379f70a3af11ecbff1db1567aff869; _clsk=nd9336|1647273413426|2|1|d.clarity.ms/collect'}
headers = {'X-Requested-With': 'XMLHttpRequest'}
api_url = "https://www.ejendomstorvet.dk/search/result?gethighlighted=false&imagewidth=620&imageheight=400"
# Single request to the JSON endpoint that the page itself calls
jsonData = requests.get(api_url, headers=headers, cookies=cookies).json()

data = []
for page in range(1, 17):
    # print(page)
    # Note: this only changes the local dict. No new request is sent here,
    # so every pass reads the same PropertyResultList.
    jsonData['NumberOfPages'] = page
    for item in jsonData['PropertyResultList']:
        title = item['Flashline']
        url = item['RefUrl']
        address = item['Address']
        city = item['City']
        data.append([title, url, address, city])
        # print(title)

cols = ['title', 'url', 'address', 'city']
df = pd.DataFrame(data, columns=cols)
print(df)
# df.to_csv('output.csv', index=False)  # to store data
Output:
title ... city
0 Nyopført ejendom ved Ny Ellebjerg st. ... København SV
1 Nyopført ejendom ved Ny Ellebjerg st. ... København SV
2 Nyopført ejendom ved Ny Ellebjerg St. ... København SV
3 Nyopført ejendom ved Ny Ellebjerg St. ... København SV
4 Nyopført ejendom ved Ny Ellebjerg St. ... København SV
.. ... ... ...
187 Strandgade 7, 1401 København K ... København K
188 Centralt beliggende erhvervslokale på Frederik... ... Frederiksberg C
189 Charmerende café ... København Ø
190 Velbeliggende mindre butik/café centralt i Øre... ... København S
191 Produktionskøkken til leje på Frederiksberg ... Frederiksberg
[192 rows x 4 columns]
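The script above sends a single request before its loop, and 192 rows is consistent with 16 passes over the same 12-item list. A minimal sketch of fetching each page with its own request could look like the following. The helper names collect_all and make_fetcher are made up for illustration, the query parameter name "page" is an assumption that has not been checked against the site, and so is the reading of NumberOfPages as the total page count. Rows are de-duplicated on RefUrl, so a server that ignores the parameter does not inflate the result.

```python
from typing import Callable, Dict, List

# Field names taken from the JSON response used in the script above.
FIELDS = ("Flashline", "RefUrl", "Address", "City")


def collect_all(fetch_page: Callable[[int], Dict]) -> List[List[str]]:
    """Call fetch_page for page 1, 2, ... and return one row per distinct RefUrl."""
    rows: List[List[str]] = []
    seen = set()
    page = 1
    while True:
        payload = fetch_page(page)  # one request per page
        for item in payload.get("PropertyResultList", []):
            if item["RefUrl"] in seen:
                continue  # skip listings already collected
            seen.add(item["RefUrl"])
            rows.append([item.get(field, "") for field in FIELDS])
        # ASSUMPTION: NumberOfPages is the total number of result pages.
        if page >= payload.get("NumberOfPages", 1):
            break
        page += 1
    return rows


def make_fetcher(session, api_url, headers=None, page_param="page"):
    """Build a fetch_page function around a requests.Session-like object.

    page_param is a HYPOTHETICAL query parameter name, not confirmed for this site.
    """
    def fetch_page(page: int) -> Dict:
        resp = session.get(api_url, headers=headers,
                           params={page_param: page}, timeout=30)
        resp.raise_for_status()
        return resp.json()
    return fetch_page
```

With requests installed, this would be called as collect_all(make_fetcher(requests.Session(), api_url, headers)), after setting the same cookies on the session. The real paging parameter can be found in the browser's network tab by watching the request that fires when the next page of results loads.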
Answered By - F.Hoque