Issue
There is an e-commerce page, https://www.jooraccess.com/r/products?token=feba69103f6c9789270a1412954cf250, that lists hundreds of products, and each product has an image slider (slideshow). I need to scrape all the images from the page. I know how to grab the first image in each slider, but I can't figure out how to scrape the remaining images in each slider.
I have inspected the element and noticed that each time I change the image in the slider, this part
<div data-position="4" class="PhotoBreadcrumb_active__2T6z2 PhotoBreadcrumb_dot__2PbsQ"></div>
moves through these positions (in the example below, image #4 is selected):
<div class="PhotoBreadcrumb_breadcrumbContainer__2cALf" data-testid="breadcrumbContainer">
<div data-position="0" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="1" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="2" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="3" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="4" class="PhotoBreadcrumb_active__2T6z2 PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="5" class="PhotoBreadcrumb_dot__2PbsQ"></div>
</div>
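The markup above suggests that each slider has one dot per image and that the active dot carries an extra PhotoBreadcrumb_active class. As a minimal sketch, assuming that reading is correct, the standard-library parser below counts the dots and finds the selected position in the snippet from the question. The names BREADCRUMB_HTML and DotParser are made up for this illustration.

```python
from html.parser import HTMLParser

# The breadcrumb markup copied from the question.
BREADCRUMB_HTML = """
<div class="PhotoBreadcrumb_breadcrumbContainer__2cALf" data-testid="breadcrumbContainer">
<div data-position="0" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="1" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="2" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="3" class="PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="4" class="PhotoBreadcrumb_active__2T6z2 PhotoBreadcrumb_dot__2PbsQ"></div>
<div data-position="5" class="PhotoBreadcrumb_dot__2PbsQ"></div>
</div>
"""


class DotParser(HTMLParser):
    """Collects every data-position value and remembers the active one."""

    def __init__(self):
        super().__init__()
        self.positions = []
        self.active = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "div" or "data-position" not in attrs:
            return  # the container div has no data-position, so it is skipped
        position = int(attrs["data-position"])
        self.positions.append(position)
        # Assumption: the class of the selected dot contains "PhotoBreadcrumb_active".
        if "PhotoBreadcrumb_active" in (attrs.get("class") or ""):
            self.active = position


parser = DotParser()
parser.feed(BREADCRUMB_HTML)
print(len(parser.positions), parser.active)  # prints: 6 4
```

So this slider holds six images, and the dot count tells you how many clicks are needed to see them all.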
Solution
To scrape the src attribute values of all the images in the first slider, you need to:
Click each dot of the slider in turn, using WebDriverWait with element_to_be_clickable().
Read the src attribute of the displayed image after each click, using WebDriverWait with visibility_of_element_located().
You can use the following locator strategies:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # assumption: Chrome; any WebDriver should work
driver.get("https://www.jooraccess.com/r/products?token=feba69103f6c9789270a1412954cf250")
wait = WebDriverWait(driver, 20)
# Common prefix of the locators: the first product's slider
slider = "//div[contains(@class, 'Grid_Row__2R-IV') and contains(@class, 'Grid_left')]/div"
# Image shown initially (position 0)
print(wait.until(EC.visibility_of_element_located((By.XPATH, slider + "//img"))).get_attribute("src"))
# Click dots 1 to 4 in turn and print the image shown after each click
for position in range(1, 5):
    wait.until(EC.element_to_be_clickable((By.XPATH, f"{slider}//div[@data-position='{position}']"))).click()
    print(wait.until(EC.visibility_of_element_located((By.XPATH, slider + "//img"))).get_attribute("src"))
Console Output:
https://cdn.jooraccess.com/img/uploads/accounts/678917/images/Sundays_NYC_3202%20(1).jpg
https://cdn.jooraccess.com/img/uploads/accounts/678917/images/Sundays_NYC_3207.jpg
https://cdn.jooraccess.com/img/uploads/accounts/678917/images/Maya%20dress_Floral03.jpg
https://cdn.jooraccess.com/img/uploads/accounts/678917/images/Maya%20dress_Floral04.jpg
https://cdn.jooraccess.com/img/uploads/accounts/678917/images/Maya%20dress_Floral05.jpg
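The click-and-read steps above can be wrapped in a reusable function so the number of dots does not have to be hard-coded. This is only a sketch under assumptions: the helper names dot_xpath and collect_slider_srcs are hypothetical, and it assumes the XPath you pass in matches exactly one slider and that each slider has one data-position dot per image, as in the markup from the question.

```python
def dot_xpath(slider_xpath, position):
    """XPath for the breadcrumb dot at `position` inside one slider."""
    return f"{slider_xpath}//div[@data-position='{position}']"


def collect_slider_srcs(driver, slider_xpath, timeout=20):
    """Click every dot of one slider and return the image URLs in order."""
    # Imported lazily so dot_xpath() stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    image = (By.XPATH, f"{slider_xpath}//img")

    # The first image is visible before any click (position 0).
    srcs = [wait.until(EC.visibility_of_element_located(image)).get_attribute("src")]

    # Assumption: one dot per image, so the dot count is the slide count.
    dots = driver.find_elements(By.XPATH, f"{slider_xpath}//div[@data-position]")

    for position in range(1, len(dots)):
        dot = (By.XPATH, dot_xpath(slider_xpath, position))
        wait.until(EC.element_to_be_clickable(dot)).click()
        srcs.append(wait.until(EC.visibility_of_element_located(image)).get_attribute("src"))
    return srcs
```

To target a single slider, the locator from the answer can be wrapped in an index, for example collect_slider_srcs(driver, "(//div[contains(@class, 'Grid_Row__2R-IV') and contains(@class, 'Grid_left')]/div)[1]"). Whether increasing that index walks through the other products depends on the page structure, which is not confirmed here. If the image swaps with a delay, the visibility wait may still return the previous src, so a wait for the src to change might be needed.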
Answered By - undetected Selenium