Issue
I'm trying to write a Python script that collects information about flyers on the site "doveconviene.it". Some flyers on this site (not all of them, see the next paragraph) have clickable products: clicking a product opens a modal div that shows the product name and price. The modal is empty until a product is clicked, so to retrieve the information I need, I have to click on each product.
To replicate what i'm saying:
- Go to "doveconviene.it" and select a supermarket flyer whose products become darker when you hover over them (try the Lidl, Eurospin, MD or Penny flyers, or other supermarket flyers).
- Click on a product.
- A modal window will appear with the product image, name and price.
I'm trying to use Selenium in Python to scroll through each product in the flyer, click on it, retrieve the product name and price from the modal div and then move to the next product div in the same page to do the same.
The problem is that, in Chrome, when I right-click a product and choose Inspect, the div that gets highlighted is not clickable, so I can't automate clicks on those elements to retrieve the information I need.
Every product of the flyer has the following HTML structure:
<div class = "css-0 e5l4rtr2">
    <div class = "css-XXXXX e5l4rtr0">
        <div role = "button">
        </div>
    </div>
    <div class = "css-XXXXX e5l4rtr1"> <- Div selected by Chrome's 'inspect' option
    </div>
</div>
This is what I did to get the flyer page I'm working on:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = webdriver.ChromeOptions()
options.add_argument('--headless=new')  # Run Chrome in headless mode
# Pass the options to the driver, otherwise the headless flag is ignored
driver = webdriver.Chrome(options=options)
driver.get("https://www.doveconviene.it/discount/penny/volantino/ultime-offerte-penny?flyerId=1117478&flyerPage=1")
driver.maximize_window()
To select the div that Chrome's 'inspect' option highlights for a product, I'm using the following instructions:
productDiv = WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@class, 'e5l4rtr1')]")))
WebDriverWait(driver, 10).until(EC.invisibility_of_element_located(productDiv))
productDiv.click()
The last instruction gives me the following error, so this div is not clickable (right?):
selenium.common.exceptions.ElementClickInterceptedException: Message: element click intercepted: Element <div x="29.55555555555556" y="423.11111111111103" width="276.44444444444446" height="352.0000000000001" class="css-ghw7y6 e5l4rtr1"></div> is not clickable at point (1331, 666).
I've tried without the second instruction but without success.
I've also tried .click() on the other div that has role = button attribute, again without success.
Is there a way to achieve what I'm trying to do?
Which element of the page needs to be clicked to open the modal window?
Solution
The error you see is caused by the 'privacy settings' (accept cookies) banner, which is displayed when you open the website for the first time (in a new user session).
You can disable headless mode to see that banner.
To resolve this, one option is to open the website, add the consent cookie (in your case the cookie name is OptanonConsent), and refresh the page.
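The cookie route could be sketched roughly as follows. This is a minimal sketch, not tested against the site: `apply_consent_cookie` is a hypothetical helper name, and `consent_value` is an assumption, expected to be copied from a browser session in which the banner was already accepted (for example from Chrome's DevTools cookie view). Only the standard WebDriver methods `get`, `add_cookie` and `refresh` are used.

```python
def apply_consent_cookie(driver, url, consent_value):
    """Open url, set the OptanonConsent cookie, then reload the page.

    driver is any Selenium WebDriver instance. consent_value is a
    placeholder: it is assumed to come from a real browser session
    where the cookie banner was already accepted.
    """
    # A cookie can only be added for the domain that is currently loaded,
    # so the page has to be opened once before calling add_cookie.
    driver.get(url)
    driver.add_cookie({"name": "OptanonConsent", "value": consent_value})
    # Reload so the page is rendered with the consent cookie in place.
    driver.refresh()
```

It would be called once, right after creating the driver and in place of the initial `driver.get(...)`, passing the flyer URL and the copied cookie value.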
Or you can handle it explicitly via Selenium:
from selenium.common.exceptions import TimeoutException

wait = WebDriverWait(driver, 15)
short_wait = WebDriverWait(driver, 2)

driver.get("https://www.doveconviene.it/discount/penny/volantino/ultime-offerte-penny?flyerId=1117478&flyerPage=1")
driver.maximize_window()

# handle cookies modal
try:
    el = short_wait.until(EC.presence_of_element_located((By.ID, "onetrust-accept-btn-handler")))
    if el.is_displayed():
        el.click()
except TimeoutException:
    # the banner did not show up within 2 seconds, nothing to accept
    pass
# make sure the banner no longer covers the page before clicking products
wait.until(EC.invisibility_of_element_located((By.ID, "onetrust-accept-btn-handler")))

# parsing logic goes here
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@class, 'e5l4rtr1')]")))
products = driver.find_elements(By.XPATH, "//div[contains(@class, 'e5l4rtr1')]")
for product in products:
    product.click()
    name = wait.until(EC.visibility_of_element_located((By.XPATH, "//p[contains(@class, 'e6sviiv0')]"))).text
    price = wait.until(EC.visibility_of_element_located((By.XPATH, "//span[contains(@class, 'e10wf1jb0')]"))).text
    print(f"Collected {name} with price {price}")
Also, I'd recommend paying attention to the locators: classes like 'e5l4rtr1' can be dynamic (i.e. generated by the frontend framework), so they may change over time.
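As one possible illustration of a locator that does not depend on generated class names, the sketch below targets the `role = "button"` div shown in the question's HTML. It is an assumption that this attribute is stable and that it matches only product elements on the page; whether clicking that element opens the modal once the cookie banner is gone has not been verified here. `PRODUCT_BUTTON_XPATH` and `find_product_buttons` are hypothetical names.

```python
# XPath built only from an attribute visible in the question's HTML,
# instead of a generated class such as 'e5l4rtr1'.
PRODUCT_BUTTON_XPATH = "//div[@role='button']"


def find_product_buttons(driver):
    """Return every element matching PRODUCT_BUTTON_XPATH.

    driver is any Selenium WebDriver instance. The string "xpath" is the
    value of selenium's By.XPATH constant, used here so the sketch does
    not need an extra import.
    """
    return driver.find_elements("xpath", PRODUCT_BUTTON_XPATH)
```

If the page has other `role="button"` elements, the expression would need to be narrowed, for example by scoping it to the flyer container.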
Answered By - sashkins