Issue
This is code to pull a product name and price from a specific URL:
from bs4 import BeautifulSoup
from selenium import webdriver
import time
import csv
driver = webdriver.Chrome()
driver.get('https://www.woolworths.com.au/shop/productdetails/84552/coca-cola-classic-soft-drink-multipack-cans')
time.sleep(3)
soup = BeautifulSoup(driver.page_source, 'lxml')
name = soup.find('h1', class_="shelfProductTile-title heading3").text
price = soup.find('div', class_="price price--large").text
print(name)
print(price)
header = ['name', 'price']
data = [name, price]
with open('Test.csv', 'w', encoding='UTF8', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(data)
driver.quit()
print("done")
And the output:
Coca-cola Classic Soft Drink Multipack Cans 375ml X30 Pack
$
24
.
90
done
Process finished with exit code 0
The line breaks appear because the website splits the price across several elements with separate classes. How can I remove them so that I just get a result like this:
Coca-cola Classic Soft Drink Multipack Cans 375ml X30 Pack
$24.90
Solution
What about stripping the newline characters from the extracted text with str.replace():
price = soup.find('div', class_="price price--large").text.replace('\n', '')
Answered By - Vojtěch Chvojka
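Note that replace('\n', '') removes only newline characters. If the page markup also puts spaces or tabs between the price parts, those would remain. A minimal sketch of a more general cleanup follows; the helper name clean_price is hypothetical, and the raw string is an assumption that mimics the output shown in the question rather than text captured from the site:

```python
def clean_price(raw: str) -> str:
    """Remove every whitespace character (newlines, spaces, tabs)."""
    # str.split() with no arguments splits on any run of whitespace
    # and discards empty pieces, so joining the parts rebuilds the price.
    return "".join(raw.split())


# Assumed sample: the price text as it was printed in the question.
raw_price = "\n$\n24\n.\n90\n"
price = clean_price(raw_price)
print(price)  # $24.90
```

In the original script this would be applied as price = clean_price(soup.find('div', class_="price price--large").text). BeautifulSoup's own get_text(strip=True) on the same element should give a similar result, since it strips each text fragment before joining them.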