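Downloading files with Selenium
I am trying to write a Selenium program that automatically downloads and uploads some files.
To be clear, I am not doing this for testing; I want to automate some tasks.
These are the preferences I have set for Firefox: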
profile.set_preference('browser.download.folderList', 2) # custom location
profile.set_preference('browser.download.manager.showWhenStarting', False)
profile.set_preference('browser.download.dir', '/home/jj/web')
profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'application/json, text/plain, application/vnd.ms-excel, text/csv, text/comma-separated-values, application/octet-stream')
profile.set_preference('browser.helperApps.alwaysAsk.force', False)
However, I still see the download dialog.
1 Answer
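Selenium's Firefox driver launches the Firefox GUI. When a download starts, Firefox pops up a window asking whether you want to open the file or save it. As far as I know, this is browser behavior that cannot be turned off through Firefox settings or the profile. To avoid the download dialog, I use a combination of Mechanize and Selenium: Selenium retrieves the download link, and I pass that link to Mechanize to perform the actual download. Mechanize has no GUI, so no dialog windows appear.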
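The core of this idea (take a link found in the browser and fetch it with a client that has no GUI) can be sketched with the standard library alone. This is only a minimal illustration, not the Mechanize code from this answer: it uses `urllib.request` instead, the helper name `download_to` is made up here, and it assumes the link can be fetched without logging in.

```python
import shutil
import urllib.request


def download_to(url, dest_path, timeout=30):
    """Stream the resource at `url` into the file `dest_path`.

    Hypothetical helper for illustration; it assumes the URL needs
    no login or session cookies.
    """
    # urlopen runs without any browser window, so no save dialog can appear.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        with open(dest_path, "wb") as out_file:
            # Copy in chunks so large files are not held in memory at once.
            shutil.copyfileobj(response, out_file)
    return dest_path
```

If the download sits behind a login, as in my case, something still has to authenticate first, which is what Mechanize handles in the code below.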
The code below is written in Python and is part of a class that performs the download.
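# These imports are required
from selenium import webdriver
from selenium.webdriver.common.by import By
import mechanize
import time
# Start the firefox browser using Selenium
self.driver = webdriver.Firefox()
# Load the download page using its URL.
self.driver.get(self.dnldPageWithKey)
time.sleep(3)
# Find the download link and read its URL (the link is not clicked).
# find_element(By.ID, ...) replaces find_element_by_id, which was
# removed in newer Selenium 4 releases.
elem = self.driver.find_element(By.ID, "regular")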
dnldlink = elem.get_attribute("href")
logfile.write("Download Link is: " + dnldlink)
pos = dnldlink.rfind("/")
dnldFilename = dnldlink[pos+1:]
dnldFilename = "/home/<mydir>/Downloads/" + dnldFilename
logfile.write("Download filename is: " + dnldFilename)
#### Now Using Mechanize ####
# Above, Selenium retrieved the download link. Because of Selenium's
# firefox download issue: it presents a download dialog that requires
# user input, Mechanize will be used to perform the download.
# Setup the mechanize browser. The browser does not get displayed.
# It is managed behind the scenes.
br = mechanize.Browser()
# Open the login page, the download requires a login
resp = br.open(webpage.loginPage)
# Select the form to use on this page. There is only one, it is the
# login form.
br.select_form(nr=0)
# Fill in the login form fields and submit the form.
br.form['login_username'] = theUsername
br.form['login_password'] = thePassword
br.submit()
# The page returned after the submit is a transition page with a link
# to the welcome page. In a user interactive session the browser would
# automatically switch us to the welcome page.
# The first link on the transition page will take us to the welcome page.
# This step may not be necessary, but it puts us where we should be after
# logging in.
br.follow_link(nr=0)
# Now download the file
br.retrieve(dnldlink, dnldFilename)
# After the download, close the Mechanize browser; we are done.
br.close()
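One caveat about the filename step above: taking everything after the last "/" also picks up any query string or fragment in the link. A slightly more defensive variant could look like the sketch below. The helper name `filename_from_url` and the fallback name "download" are made up for illustration and are not part of my original code.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, default="download"):
    """Return the last path segment of `url`, without query or fragment.

    Hypothetical helper; falls back to `default` when the path ends
    in a slash and therefore has no file name.
    """
    # urlsplit separates the path from "?query" and "#fragment".
    path = urlsplit(url).path
    # Decode percent-escapes such as %20, then keep only the final segment.
    name = posixpath.basename(unquote(path))
    return name or default
```

The result can then be joined to the download directory in the same way as `dnldFilename` above.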
This approach works for me, and I hope it helps you too. If there is a simpler solution, I would love to hear about it.