urllib2 file name

Asked 2025-04-11 09:23

If I open a file with urllib2, like this:

remotefile = urllib2.urlopen('http://example.com/somefile.zip')

is there a simple way to get the file name, other than parsing the original URL?

Edit: changed openfile to urlopen... not sure how that happened.

Edit 2: I ended up using:

filename = url.split('/')[-1].split('#')[0].split('?')[0]

If I'm not mistaken, this should also strip any fragment and query parameters.
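
To see what that one-liner does with a few inputs, here is a small sketch. The wrapper name `filename_from_url` and the sample URLs are made up for illustration. One caveat it shows: the string is split on `/` before the query is removed, so a slash inside the query string changes the result.

```python
def filename_from_url(url):
    # Same expression as above: take the last '/'-separated piece,
    # then cut off the fragment ('#...') and the query ('?...').
    return url.split('/')[-1].split('#')[0].split('?')[0]

print(filename_from_url('http://example.com/somefile.zip'))
print(filename_from_url('http://example.com/somedir/somefile.zip?foo=bar#top'))

# Edge case: the query contains a slash, so the "last piece" is no longer
# the file name and this call returns 'home' instead of 'somefile.zip'.
print(filename_from_url('http://example.com/somefile.zip?next=/home'))
```

For URLs whose query strings never contain a slash, the one-liner is probably good enough.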

14 Answers

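I don't think "file name" is a well-defined concept in an HTTP transfer. The server may supply one in the Content-Disposition header, but it is not required to. You can try remotefile.headers['Content-Disposition'] to read it; note that this gives you the whole header value (for example attachment; filename="somefile.zip"), so the name still has to be picked out of it. If the header is missing, you will probably have to extract the file name from the URL yourself.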
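
A minimal sketch of that "header first, URL second" idea follows. It assumes Python 3, where `urllib.request.urlopen` takes the place of `urllib2.urlopen` and the response's `headers` object is a subclass of `email.message.Message`; the urllib2 header object in Python 2 does not offer `get_filename()`. The function name `pick_filename` and the sample header values are hypothetical, and the headers are built by hand so the example runs without a network connection.

```python
from email.message import Message

def pick_filename(headers, url):
    # get_filename() returns the filename parameter of Content-Disposition,
    # or None when the header (or the parameter) is absent.
    name = headers.get_filename()
    if name:
        return name
    # Fallback: the URL-based expression from the question.
    return url.split('/')[-1].split('#')[0].split('?')[0]

# Hand-built stand-ins for what response.headers would hold (assumption).
with_header = Message()
with_header['Content-Disposition'] = 'attachment; filename="report-2024.zip"'
no_header = Message()

print(pick_filename(with_header, 'http://example.com/download?id=7'))
print(pick_filename(no_header, 'http://example.com/somefile.zip'))
```

With a real response you would pass `remotefile.headers` as the first argument. A name chosen by the server should be treated as untrusted input before it is used as a local path.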

Answered 2025-04-11 by Python大师

If you only want the file name itself, and the URL has no query variables at the end (as in http://example.com/somedir/somefile.zip?foo=bar), you can use os.path.basename:

[user@host]$ python
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04) 
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.path.basename("http://example.com/somefile.zip")
'somefile.zip'
>>> os.path.basename("http://example.com/somedir/somefile.zip")
'somefile.zip'
>>> os.path.basename("http://example.com/somedir/somefile.zip?foo=bar")
'somefile.zip?foo=bar'

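Some other posts mention urlparse, which also works, but you would still need to strip the directory part in front of the file name. With os.path.basename() you don't have to worry about that, since it returns only the last component of the URL or file path.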
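
The two approaches can also be combined, which takes care of the query-string case shown in the last line of the session above. This is only a sketch: the helper name `url_basename` is made up, `posixpath.basename` is swapped in for `os.path.basename` because URL paths always use forward slashes, and the `unquote` step (decoding `%20` and similar escapes) is an addition not mentioned in this thread.

```python
import posixpath

try:
    from urllib.parse import urlsplit, unquote  # Python 3
except ImportError:
    from urlparse import urlsplit  # Python 2, as used in this thread
    from urllib import unquote

def url_basename(url):
    # urlsplit() separates the query and fragment from the path,
    # so only the path is handed to basename().
    path = urlsplit(url).path
    return unquote(posixpath.basename(path))

print(url_basename('http://example.com/somedir/somefile.zip?foo=bar'))
print(url_basename('http://example.com/somedir/some%20file.zip#top'))
# A slash inside the query no longer matters, since the query is parsed off first.
print(url_basename('http://example.com/somefile.zip?next=/home'))
```

A URL that ends in a slash has no last path component, so this helper returns an empty string for it.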

Answered 2025-04-11 by Python大师

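Did you mean urllib2.urlopen?

If the server sends a Content-Disposition header, you may be able to get the intended file name by checking remotefile.info()['Content-Disposition']. As it stands, though, you will probably still have to extract the file name from the URL yourself.

You could use urlparse.urlsplit, but with a URL like the second example you still end up having to pull the file name out of the path yourself: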

>>> import urlparse
>>> urlparse.urlsplit('http://example.com/somefile.zip')
('http', 'example.com', '/somefile.zip', '', '')
>>> urlparse.urlsplit('http://example.com/somedir/somefile.zip')
('http', 'example.com', '/somedir/somefile.zip', '', '')

So why not just do this:

>>> 'http://example.com/somefile.zip'.split('/')[-1]
'somefile.zip'
>>> 'http://example.com/somedir/somefile.zip'.split('/')[-1]
'somefile.zip'

Answered 2025-04-11 by Python大师