Issue
I am trying to get the download link from each page and download the files.
I have a log file which contains the following links:
http://www.downloadcrew.com/article/18631-aida64
http://www.downloadcrew.com/article/4475-sumo
http://www.downloadcrew.com/article/2174-iolo_system_mechanic_professional
...
...
I have code like this:
import urllib, time
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

f = open("dcrewtest.txt")
for line in f.readlines():
    try:
        driver.find_element_by_xpath("//div/div[2]/div[2]/div[2]/div[3]/div/a/img").click()
        time.sleep(8)
    except:
        pass
    # strip the trailing newline so the URL can be opened
    url = line.strip()
    pageurl = urllib.urlopen(url).read()
    soup = BeautifulSoup(pageurl)
    for a in soup.select("h1#articleTitle"):
        print a.contents[0].strip()
    for b in soup.findAll("th"):
        if b.text == "Date Updated:":
            print b.parent.td.text
        elif b.text == "Developer:":
            print b.parent.td.text
Beyond this point, I do not know how to get the download link and download the file. Is it possible to download the file using Selenium?
Solution
According to the documentation, you should configure a FirefoxProfile to automatically download files with a specified content type. Here's an example using the first URL in your txt file that saves the exe file in the current directory:
import os
from selenium import webdriver

fp = webdriver.FirefoxProfile()
# 2 = save to the custom directory given in browser.download.dir
fp.set_preference("browser.download.folderList", 2)
fp.set_preference("browser.download.manager.showWhenStarting", False)
fp.set_preference("browser.download.dir", os.getcwd())
# content types that are saved without showing the download dialog
fp.set_preference("browser.helperApps.neverAsk.saveToDisk", "application/x-msdos-program")

driver = webdriver.Firefox(firefox_profile=fp)
driver.get("http://www.downloadcrew.com/article/18631-aida64")
# click the download image to start the download
driver.find_element_by_xpath("//div[@class='downloadLink']/a/img").click()
Note that I've also simplified the XPath.
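To apply the same idea to every link in the txt file, something along these lines might work. This is only a rough sketch, written in Python 3 and assuming the Selenium 4 style API (Options.set_preference and find_element(By.XPATH, ...)) rather than the FirefoxProfile calls used above. The function names read_urls, build_download_prefs and download_all are made up for illustration, "application/octet-stream" is a guessed extra content type that the site may or may not send, and the 8 second pause is simply borrowed from the question.

```python
import os
import time


def read_urls(path):
    """Return the non-empty lines of a text file, stripped of whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def build_download_prefs(download_dir, mime_types):
    """Collect the Firefox preferences from the answer in a plain dict."""
    return {
        "browser.download.folderList": 2,  # 2 = use the custom directory below
        "browser.download.manager.showWhenStarting": False,
        "browser.download.dir": download_dir,
        # Comma-separated content types that are saved without a prompt.
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
    }


def download_all(url_file, download_dir=None, wait_seconds=8):
    """Open each article URL and click its download image (needs Firefox)."""
    # Imported here so the two helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options

    options = Options()
    prefs = build_download_prefs(
        download_dir or os.getcwd(),
        # Assumed list: only the first type comes from the answer above.
        ["application/x-msdos-program", "application/octet-stream"],
    )
    for name, value in prefs.items():
        options.set_preference(name, value)

    driver = webdriver.Firefox(options=options)
    try:
        for url in read_urls(url_file):
            driver.get(url)
            driver.find_element(
                By.XPATH, "//div[@class='downloadLink']/a/img"
            ).click()
            # Crude fixed pause, as in the question; large files may need
            # longer, and quitting early can cut a download short.
            time.sleep(wait_seconds)
    finally:
        driver.quit()
```

Calling download_all("dcrewtest.txt") would then visit each link in turn. If a file still triggers the save dialog, its content type is probably missing from the list passed to build_download_prefs.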
Answered By - alecxe