Issue
Hi guys, I am trying to scrape some data from Airbnb to create a mini data analysis project for my portfolio.
I tried several tutorials with BeautifulSoup, but none of them works today, even when I use the very same link used in the tutorials.
Because of this I turned to Selenium. I managed to open the site, and as a first step I am trying to extract the listing names. After that I would like to extract all the other information (price, reviews, rating, amenities, etc.).
My code is below, but I am getting an empty list. Can anyone help me get the name of the apartment?
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
website = 'https://www.airbnb.com/s/Thessaloniki--Greece/homes?tab_id=home_tab&flexible_trip_lengths%5B%5D=one_week&refinement_paths%5B%5D=%2Fhomes&place_id=ChIJ7eAoFPQ4qBQRqXTVuBXnugk&query=Thessaloniki%2C%20Greece&date_picker_type=calendar&search_type=unknown'
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(website)
titles = driver.find_elements("class name", "n1v28t5c s1cjsi4j dir dir-ltr")
Thanks.
Solution
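Selenium combined with BeautifulSoup (bs4) works here and returns the right data: let Selenium load and render the page, wait a few seconds, then pass driver.page_source to BeautifulSoup and select each listing card by its class attribute. The class names below are the ones Airbnb served when this was written and may change.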
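A likely reason for the empty list in the question is the locator itself: the "class name" strategy is meant for a single class, while "n1v28t5c s1cjsi4j dir dir-ltr" is four classes separated by spaces. If you prefer to stay with Selenium locators, one option is to turn the class attribute into a CSS selector first. The helper below is only a sketch; `classes_to_css` is a hypothetical name, not part of Selenium, and the commented driver call assumes a live browser session.

```python
def classes_to_css(class_attr: str, tag: str = "") -> str:
    """Turn a space-separated class attribute into a CSS selector.

    "a b c" becomes ".a.b.c", which matches elements carrying all three classes.
    """
    names = class_attr.split()
    return tag + "".join("." + name for name in names)


selector = classes_to_css("n1v28t5c s1cjsi4j dir dir-ltr")
print(selector)  # .n1v28t5c.s1cjsi4j.dir.dir-ltr

# With a running driver (not executed here) the selector would be used as:
# from selenium.webdriver.common.by import By
# titles = driver.find_elements(By.CSS_SELECTOR, selector)
```

The page is rendered by JavaScript, so even with a valid selector the elements may not exist immediately after driver.get(); some form of wait is still needed.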
Example:
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import time

url = 'https://www.airbnb.com/s/Thessaloniki--Greece/homes?tab_id=home_tab&flexible_trip_lengths%5B%5D=one_week&refinement_paths%5B%5D=%2Fhomes&place_id=ChIJ7eAoFPQ4qBQRqXTVuBXnugk&query=Thessaloniki%2C%20Greece&date_picker_type=calendar&search_type=unknown'

# Start a single Chrome session and load the search results page.
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(url)
driver.maximize_window()
time.sleep(5)  # crude wait so the JavaScript-rendered listings can appear

# Hand the rendered HTML to BeautifulSoup, then close the browser.
soup = BeautifulSoup(driver.page_source, 'lxml')
driver.quit()

# Each listing card holds a title, a price and a relative link.
for card in soup.select('div[class="c4mnd7m dir dir-ltr"]'):
    title = card.select_one('div[class="t1jojoys dir dir-ltr"]').text
    price = card.select_one('span[class="a8jt5op dir dir-ltr"]').text
    link = 'https://www.airbnb.com' + card.select_one('a[class="ln2bl2p dir dir-ltr"]').get('href')
    print(title, price)
Output:
Condo in Thessaloniki $50 per night
Apartment in Thessaloniki $38 per night
Condo in Thessaloniki $80 per night
Apartment in Thessaloniki $66 per night
Condo in Thessaloniki $23 per night
Apartment in Thessaloniki $74 per night
Condo in Thessaloniki $37 per night
Apartment in Thessaloniki $45 per night
Apartment in Thessaloniki $39 per night
Condo in Thessaloniki $27 per night
Apartment in Thessaloniki $28 per night
Condo in Thessaloniki $43 per night
Apartment in Thessaloniki $94 per night
Apartment in Thessaloniki $24 per night
Condo in Thessaloniki $86 per night
Loft in Thessaloniki $23 per night
Apartment in Thessaloníki $45 per night
Apartment in Thessaloniki $44 per night
Condo in Thessaloniki $50 per night
Condo in Thessaloniki $51 per night
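Since the goal is a data analysis project, the scraped price text usually needs to become a number before it is useful. Here is a minimal sketch of that step, assuming prices look like the output above (a "$" followed by digits). `parse_price` and the `price_usd` key are hypothetical names, and the sample pairs are copied from the first two output lines rather than scraped live.

```python
import re


def parse_price(price_text: str) -> int:
    """Extract the integer amount from text such as '$50 per night'."""
    match = re.search(r"\$\s*([\d,]+)", price_text)
    if match is None:
        raise ValueError(f"no dollar amount found in {price_text!r}")
    # Drop thousands separators before converting, e.g. '1,200' -> 1200.
    return int(match.group(1).replace(",", ""))


# (title, price) pairs as they would come out of the scraping loop.
sample = [
    ("Condo in Thessaloniki", "$50 per night"),
    ("Apartment in Thessaloniki", "$38 per night"),
]

rows = [{"title": title, "price_usd": parse_price(price)} for title, price in sample]
print(rows)
```

A list of dicts like `rows` can be passed straight to pandas.DataFrame(rows) if you want to continue the analysis in pandas.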
Answered By - Fazlul