Issue
I am working on a project from the book 'Automate the Boring Stuff with Python'. The task is to download all images for a specific search tag from either Flickr or Imgur. I chose Flickr, but my TimeoutException handler doesn't seem to work even though it is defined.
#! python3
# Download all images from Flickr using a user-defined tag
import logging, os, time, requests
import pyinputplus as pyip
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

logging.basicConfig(level=logging.DEBUG, format=' %(asctime)s - %(levelname)s - %(message)s')

# Ask user for a tag
print("Hello user, this utility helps you to download images of specific 'tag' to your PC.\n")
tag = pyip.inputStr("Please choose a tag:", allowRegexes=[r'\s'])
os.makedirs(tag, exist_ok=True)  # creates a folder named after the user-defined tag

# Build the search URL
baseurl = "https://www.flickr.com/"
search_prefix = "search/?text="
search_suffix = "&media=photos"
search_url = baseurl + search_prefix + tag + search_suffix
logging.debug(f'{search_url}')

# Open Firefox and get the list of image page links
driver = webdriver.Firefox()
driver.get(search_url)
time.sleep(2)
item_links = []
# find_elements(By.CLASS_NAME, ...) replaces find_elements_by_class_name,
# which was removed in newer Selenium releases
for item in driver.find_elements(By.CLASS_NAME, 'overlay'):
    item_links.append(item.get_attribute('href'))
print('Python script has found %s images.\n' % (len(item_links)))
count = pyip.inputInt('How many images would you like to download?')

# Get the direct link for each image
image_links = []
for i in range(0, count):
    image_link = str(item_links[i]) + 'sizes/l/'
    logging.debug(f'{image_link}')
    driver.get(image_link)
    try:
        webelement = WebDriverWait(driver, 5).until(EC.presence_of_element_located((By.CSS_SELECTOR, '#allsizes-photo > img:nth-child(1)')))
        image_links.append(webelement.get_attribute('src'))
    except TimeoutException as e:
        pass

# Download all images
for link in image_links:
    response = requests.get(link)
    response.raise_for_status()
    # the with block closes the file even if a write fails
    with open(os.path.join(tag, os.path.basename(link)), 'wb') as file:
        for chunk in response.iter_content(100000):
            file.write(chunk)
print('Done.')
Everything works until the web element cannot be located because of download restrictions, even though 'except TimeoutException' is defined. I found that the CSS selector changes from #allsizes-photo > img:nth-child(1) (when the image is present) to #allsizes-photo > img:nth-child(2) (when it is not), but I had the impression that the exception handler should cover that case. Could you please advise what I am doing wrong?
Link examples as requested:
webelement present flickr.com/photos/bjarnekosmeijer/48718018992/sizes/l
timeoutexception flickr.com/photos/ebanatawka/5062862631/sizes/l
Solution
I am using the CSS selector div#allsizes-photo img with visibility_of_element_located in an explicit wait.
Code:
driver.get("https://www.flickr.com/photos/bjarnekosmeijer/48718018992/sizes/l/")
webelement = WebDriverWait(driver,5).until(EC.visibility_of_element_located((By.CSS_SELECTOR,"div#allsizes-photo img")))
print(webelement.get_attribute('src'))
Output
https://live.staticflickr.com/65535/48718018992_7cbf09802a_b.jpg
Answered By - cruisepandey