Downloading files from Google Patents using Python 3.4

2 answers

User

#1 · edited 2024-06-10 06:46:23

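As I understand it, you are looking for a command that simulates a left-click on the file and downloads it automatically. If so, you can use Selenium. For example: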

from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

profile = FirefoxProfile()
# 2 = save downloads to the custom folder set in browser.download.dir
profile.set_preference("browser.download.folderList", 2)
profile.set_preference("browser.download.manager.showWhenStarting", False)
profile.set_preference("browser.download.dir", 'D:\\')  # choose the folder to download to
# Save this MIME type straight to disk without showing the download dialog
profile.set_preference("browser.helperApps.neverAsk.saveToDisk", 'application/octet-stream')
driver = webdriver.Firefox(firefox_profile=profile)
driver.get('https://www.google.com/googlebooks/uspto-patents-grants-text.html#2015')
# Locate one zip link by its text; use a loop to go through all the zip files
filename = driver.find_element_by_xpath('//a[contains(text(),"ipg150106.zip")]')
filename.click()

Updated! The MIME type to use for the zip files is "application/octet-stream", not "application/zip". It should work now :)

User

#2 · edited 2024-06-10 06:46:23

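The HTML you are downloading is the page of links, not the files themselves. You need to parse that HTML to find all the download links. You can use a library such as Beautiful Soup for this.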
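As a rough illustration of the parsing step, here is a minimal sketch that uses the standard library's `html.parser` in place of Beautiful Soup, so nothing extra has to be installed. The names `ZipLinkParser` and `extract_zip_links`, the `.zip` filter, and the sample markup are all assumptions made for this example; the real page's markup may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ZipLinkParser(HTMLParser):
    """Collect the href of every <a> tag that points at a .zip file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".zip"):
            # Resolve relative links against the page address.
            self.links.append(urljoin(self.base_url, href))


def extract_zip_links(html, base_url):
    """Return absolute URLs of all .zip links found in an HTML string."""
    parser = ZipLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample markup, not copied from the real page.
sample = (
    '<a href="http://example.com/grants/ipg150106.zip">ipg150106.zip</a>'
    '<a href="about.html">About</a>'
    '<a href="files/ipg150113.zip">ipg150113.zip</a>'
)
print(extract_zip_links(sample, "http://example.com/index.html"))
```

With the real page, the decoded HTML and the page's own URL would be passed in instead of the sample string.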

However, the structure of that page is very regular, so you can use a regular expression to get all the download links:

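import re
import urllib.request

# url is the address of the links page
# read() returns bytes, so decode before matching with a str pattern
html = urllib.request.urlopen(url).read().decode('utf-8', errors='replace')
# non-greedy match so each href is captured on its own
links = re.findall(r'<a href="(.*?)">', html)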
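Once `links` has been collected, the files still have to be fetched one by one. The following is only a sketch of that step, assuming every entry in `links` is an absolute URL; `download_all`, its `opener` parameter, and the `patents` folder name are hypothetical names chosen for this example.

```python
import os
import shutil
import urllib.request
from urllib.parse import urlsplit


def download_all(links, dest_dir, opener=urllib.request.urlopen):
    """Save each URL in links under dest_dir and return the paths written."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for link in links:
        # Name the local file after the last path segment of the URL.
        name = os.path.basename(urlsplit(link).path)
        if not name:
            continue  # skip links that do not end in a file name
        target = os.path.join(dest_dir, name)
        with opener(link) as response, open(target, "wb") as out:
            # Copy in chunks so large archives are not held in memory.
            shutil.copyfileobj(response, out)
        saved.append(target)
    return saved


# Example call (performs real network requests):
# download_all(links, "patents")
```

The `opener` argument exists only so the function can be exercised without network access; in normal use the default `urllib.request.urlopen` is enough. Filtering `links` down to the `.zip` entries first avoids downloading unrelated pages.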
