Issue
Hi, I tried to scrape the following site: https://www.footlocker.co.uk/en/all/new/
I want to scrape the price and the 'href' from the following elements:
<span class=" fl-price--sale ">
<meta itemprop="priceCurrency" content="GBP">
<meta itemprop="price" content="84.99"><span>£ 84,99</span>
</span>
and this (href):
<a href="https://www.footlocker.co.uk/en/p/adidas-performance-don-issue-2-men-shoes-92815?v=314102617504#!searchCategory=all" data-product-click-link="314102617504" data-hash-key="searchCategory" data-hash-url="https://www.footlocker.co.uk/en/p/adidas-performance-don-issue-2-men-shoes-92815?v=314102617504" data-testid="fl-product-details-link-314102617504">
I have tried this code:
import urllib.request
import bs4 as bs
from bs4 import BeautifulSoup
import requests
proxies = {'type':'ip:port'}
r= requests.get('https://www.footlocker.de/de/alle/new/', proxies=proxies)
soup = BeautifulSoup(r.content,'html.parser')
# This doesn't find it...
for a in soup.find_all('a'):
    try:
        if a['href'] == 'https://www.footlocker.co.uk/en/p/adidas-performance-don-issue-2-men-shoes-92815?v=314102617504#!searchCategory=all':
            print(a['href'])
    except KeyError:
        # <a> tags without an href attribute are skipped
        pass
# This doesn't find it either...
for price in soup.find_all('span', class_=' fl-price--sale '):
    print(price.text)
I have tried scraping through a proxy, but the element is still not found (I think the HTML I receive isn't the same as what the browser shows).
Thanks for your advice :-) (For educational purposes only.)
Solution
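The product tiles do not appear to be part of the initial HTML; the page loads them through Ajax requests that return JSON, which would explain why your searches find nothing. To get the names, links and prices of the products, you can query those endpoints directly, as in this example: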
import requests
from bs4 import BeautifulSoup
url = 'https://www.footlocker.co.uk/INTERSHOP/web/FLE/Footlocker-Footlocker_GB-Site/en_GB/-/GBP/ViewStandardCatalog-ProductPagingAjax?SearchParameter=____&sale=new&MultiCategoryPathAssignment=all&PageNumber={}'
for page in range(3): # <--- increase the number of pages here
    print('Page {}...'.format(page))
    data = requests.get(url.format(page)).json()
    soup = BeautifulSoup(data['content'], 'html.parser')
    for d in soup.select('[data-request]'):
        s = BeautifulSoup(requests.get(d['data-request']).json()['content'], 'html.parser')
        print(s.select_one('[itemprop="name"]').text)
        print(s.select_one('[itemprop="price"]')['content'], s.select_one('[itemprop="priceCurrency"]')['content'])
        print(s.a['href'])
        print('-' * 80)
Prints:
Page 0...
adidas Performance Don Issue 2 - Men Shoes
84.99 GBP
https://www.footlocker.co.uk/en/p/adidas-performance-don-issue-2-men-shoes-92815?v=314102617504
--------------------------------------------------------------------------------
Nike Air Force 1 Crater - Women Shoes
94.99 GBP
https://www.footlocker.co.uk/en/p/nike-air-force-1-crater-women-shoes-98071?v=315349054502
--------------------------------------------------------------------------------
Jordan Jumpmcn Cl Iii Camo - Baby Tracksuits
39.99 GBP
https://www.footlocker.co.uk/en/p/jordan-jumpmcn-cl-iii-camo-baby-tracksuits-91611?v=318280390044
--------------------------------------------------------------------------------
Jordan 13 Retro - Grade School Shoes
99.99 GBP
https://www.footlocker.co.uk/en/p/jordan-13-retro-grade-school-shoes-952?v=316701533404
--------------------------------------------------------------------------------
...and so on.
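If you want to check the selectors without hitting the site, here is a minimal offline sketch that parses the fragment from the question. The closing </a> tag and the variable names are my own additions for illustration, and the real tiles may be structured differently. It also shows a second pitfall in the original attempt: as far as I know, BeautifulSoup splits the class attribute on whitespace and compares each class name exactly, so searching for ' fl-price--sale ' with the surrounding spaces should not match even when the HTML is present.

```python
from bs4 import BeautifulSoup

# Fragment copied from the question. The closing </a> is an assumption;
# on the live site this markup only arrives inside the JSON 'content'
# field returned by the Ajax endpoints.
html = '''
<a href="https://www.footlocker.co.uk/en/p/adidas-performance-don-issue-2-men-shoes-92815?v=314102617504#!searchCategory=all"></a>
<span class=" fl-price--sale ">
<meta itemprop="priceCurrency" content="GBP">
<meta itemprop="price" content="84.99"><span>£ 84,99</span>
</span>
'''

soup = BeautifulSoup(html, 'html.parser')

# Read the machine-readable values from the <meta> tags rather than
# parsing the displayed text "£ 84,99".
price = soup.select_one('[itemprop="price"]')['content']
currency = soup.select_one('[itemprop="priceCurrency"]')['content']
link = soup.a['href']

# Class search with the padded string from the question vs. the bare name.
by_exact = soup.find_all('span', class_=' fl-price--sale ')
by_name = soup.find_all('span', class_='fl-price--sale')

print(price, currency)
print(link)
print(len(by_exact), len(by_name))
```

Reading the content attribute of the meta tags gives you a value like 84.99 that converts cleanly with float(), whereas the visible text uses a comma as the decimal separator.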
Answered By - Andrej Kesely