Getting the file name and extension from a URL in Python

Published 2024-05-19 00:06:53


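I built a downloader in Python using tkinter and urllib.request, and I want to let the user download a file with its default name and extension. I know there are plenty of tutorials on how to do this, but my problem is with this particular URL: https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR9JHQ-y1AyCjkJt3gl0jTtNtQdhv0lCdDYxqnc2wY9zy_hSOSy

I have tried several approaches, such as wget and urlparse, but none of them could get the file's extension from its URL. Is there another way?

The wget code: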

import wget

url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR9JHQ-y1AyCjkJt3gl0jTtNtQdhv0lCdDYxqnc2wY9zy_hSOSy'
test = wget.detect_filename(url)
print(test)

Output with that URL:

images

The urllib.parse code:

import os
import urllib.parse

url = 'https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR9JHQ-y1AyCjkJt3gl0jTtNtQdhv0lCdDYxqnc2wY9zy_hSOSy'
path = urllib.parse.urlparse(url).path
ext = os.path.splitext(path)[1]
print(path)
print(ext)

Output with that URL (the extension is printed as an empty string):

/images

Is there something wrong with the URL?


2 Answers

Try this. You can also change the name and extension of the output file by editing the final_file_name variable. For this answer I have left it as "image.jpg".

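import requests

final_url = "https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR9JHQ-y1AyCjkJt3gl0jTtNtQdhv0lCdDYxqnc2wY9zy_hSOSy"
# Download the image and fail loudly on an HTTP error status
final_file = requests.get(final_url, timeout=10)
final_file.raise_for_status()
# Edit final_file_name to change the saved name and extension
final_file_name = "image.jpg"
with open(final_file_name, "wb") as f:
    f.write(final_file.content)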

You should be able to get the MIME type from the response headers and then use mimetypes to get an extension to suggest:

>>> import requests, mimetypes
>>>
>>> r = requests.get('https://encrypted-tbn0.gstatic.com/images?q=tbn%3AANd9GcR9JHQ-y1AyCjkJt3gl0jTtNtQdhv0lCdDYxqnc2wY9zy_hSOSy')
>>> r.headers
{'Accept-Ranges': 'bytes', 'Content-Type': 'image/jpeg', 'Access-Control-Allow-Origin': '*', 'Content-Length': '4517', 'Date': 'Sun, 19 Apr 2020 09:23:26 GMT', 'Expires': 'Mon, 19 Apr 2021 09:23:26 GMT', 'Last-Modified': 'Fri, 15 Jan 2016 11:47:48 GMT', 'X-Content-Type-Options': 'nosniff', 'Server': 'sffe', 'X-XSS-Protection': '0', 'Cache-Control': 'public, max-age=31536000', 'Age': '840', 'Alt-Svc': 'quic=":443"; ma=2592000; v="46,43",h3-Q050=":443"; ma=2592000,h3-Q049=":443"; ma=2592000,h3-Q048=":443"; ma=2592000,h3-Q046=":443"; ma=2592000,h3-Q043=":443"; ma=2592000,h3-T050=":443"; ma=2592000'}

>>> r.headers['Content-Type']
'image/jpeg'

>>> mimetypes.guess_all_extensions(r.headers['Content-Type'], strict=False)
['.jpe', '.jpeg', '.jpg']
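
To turn the header lookup above into something a downloader could call, here is a minimal sketch. The function name suggest_filename, its default_stem parameter, and the order of preference (Content-Disposition first, then the URL path, then the MIME type) are assumptions made for illustration, not part of either answer. It takes the URL and a mapping of response headers, so it can be tried without any network access.

```python
import mimetypes
import os
from email.message import Message
from urllib.parse import unquote, urlparse


def suggest_filename(url, headers, default_stem="download"):
    """Suggest a file name from a URL and a mapping of response headers.

    Hypothetical helper: the name and the fallback order are illustrative.
    """
    # Let the standard library's header parser split values such as
    # 'image/jpeg; charset=binary' or 'attachment; filename="a.jpg"'.
    msg = Message()
    for key in ("Content-Type", "Content-Disposition"):
        if headers.get(key):
            msg[key] = headers[key]

    # 1. Prefer a file name that the server states explicitly.
    name = msg.get_filename()
    if name:
        return os.path.basename(name)

    # 2. Otherwise use the last segment of the URL path (query string ignored).
    last_segment = os.path.basename(unquote(urlparse(url).path))
    stem, ext = os.path.splitext(last_segment)

    # 3. If that segment has no extension, guess one from the MIME type.
    if not ext and "Content-Type" in msg:
        ext = mimetypes.guess_extension(msg.get_content_type(), strict=False) or ""

    return (stem or default_stem) + ext
```

With requests, this could be called as suggest_filename(r.url, r.headers). For the URL in the question, the path is /images and the Content-Type is image/jpeg, so the result is "images" plus whichever JPEG extension mimetypes prefers on the running Python version. Note that mimetypes.guess_extension returns a single extension, while guess_all_extensions returns every candidate.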
