Issue
Here's the page that contains the data I want:
Here's the data I want:
Here's the HTML:
And here's my script:
import numpy as np
from time import sleep
from random import randint
import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import re
headers= {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'}
url0 = 'https://www.booking.com/searchresults.fr.html?label=gen173nr-1DCA0oTUIMZWx5c2Vlc3VuaW9uSA1YBGhNiAEBmAENuAEXyAEM2AED6AEB-AECiAIBqAIDuAL_5ZqEBsACAdICJDcxYjgyZmI2LTFlYWQtNGZjOS04Y2U2LTkwNTQyZjI5OWY1YtgCBOACAQ;sid=303509179a2849df63e4d1e5bc1ab1e3;dest_id=-1456928;dest_type=city&'
links1 = []
results = requests.get(url0, headers = headers)
soup = BeautifulSoup(results.text, "html.parser")
links1 = [a['href'] for a in soup.find("div", {"class": "hotellist sr_double_search"}).find_all('a', href=True)]
root_url = 'https://www.booking.com'
urls1 = [ '{root}{i}'.format(root=root_url, i=i) for i in links1 ]
results = requests.get(urls1[0])
soup = BeautifulSoup(results.text, "html.parser")
pointfort = [div['data-name-en'] for div in soup.find("div", {"class": "hp_desc_important_facilities clearfix hp_desc_important_facilities--bui"}).find_all('a')]
print(pointfort)
But the output is an empty list: []
What's wrong with my code? I would like to store all of this data in a list, for each hotel, like this:
Solution
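Your comprehension iterates over the <a> tags inside the facilities container, and an empty list means find_all('a') found none there. The data-name-en attribute sits on the nested div elements whose class contains "important_facility", so select those instead. Try this: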
import numpy as np
from time import sleep
from random import randint
import requests
from requests import get
from bs4 import BeautifulSoup
import pandas as pd
import re
headers= {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'}
url0 = 'https://www.booking.com/searchresults.fr.html?label=gen173nr-1DCA0oTUIMZWx5c2Vlc3VuaW9uSA1YBGhNiAEBmAENuAEXyAEM2AED6AEB-AECiAIBqAIDuAL_5ZqEBsACAdICJDcxYjgyZmI2LTFlYWQtNGZjOS04Y2U2LTkwNTQyZjI5OWY1YtgCBOACAQ;sid=303509179a2849df63e4d1e5bc1ab1e3;dest_id=-1456928;dest_type=city&'
links1 = []
results = requests.get(url0, headers = headers)
soup = BeautifulSoup(results.text, "html.parser")
links1 = [a['href'] for a in soup.find("div", {"class": "hotellist sr_double_search"}).find_all('a', href=True)]
root_url = 'https://www.booking.com'
urls1 = [ '{root}{i}'.format(root=root_url, i=i) for i in links1 ]
results = requests.get(urls1[0])
soup = BeautifulSoup(results.text, "html.parser")
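# Locate the container that holds the hotel's key facilities
div = soup.find("div", {"class": "hp_desc_important_facilities clearfix hp_desc_important_facilities--bui"})
# Read data-name-en from the nested divs (not from <a> tags)
pointfort = [x['data-name-en'] for x in div.select('div[class*="important_facility"]')]
print(pointfort)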
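To see the difference between the two selectors without hitting the site, here is a minimal offline sketch. The HTML in SAMPLE_HTML is hypothetical markup shaped like the container the code above targets, not a copy of the real page, and extract_facilities is a helper name made up for this illustration. It also guards against a missing container and against divs that lack the attribute, which the code above does not do.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption): the real Booking.com page may differ.
SAMPLE_HTML = """
<div class="hp_desc_important_facilities clearfix hp_desc_important_facilities--bui">
  <div class="important_facility" data-name-en="Free WiFi">WiFi gratuit</div>
  <div class="important_facility" data-name-en="Parking">Parking</div>
  <div class="important_facility">Sans attribut</div>
</div>
"""


def extract_facilities(html):
    """Return the data-name-en values found in the facilities container."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.select_one("div.hp_desc_important_facilities")
    if container is None:
        # Container not found (different layout or blocked request)
        return []
    # The [data-name-en] filter skips divs that lack the attribute,
    # which avoids a KeyError in the list comprehension.
    matches = container.select('div[class*="important_facility"][data-name-en]')
    return [d["data-name-en"] for d in matches]


# The original approach: no <a> tags in the container, so nothing to iterate
anchors = BeautifulSoup(SAMPLE_HTML, "html.parser").find_all("a")
print(anchors)

facilities = extract_facilities(SAMPLE_HTML)
print(facilities)
```

To collect one list per hotel, the same helper could be applied to each entry of urls1, for example [extract_facilities(requests.get(u, headers=headers).text) for u in urls1], ideally with a short sleep between requests.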
Answered By - chitown88