Unable to get page source code in Python

User

Reply 1 · edited 2024-04-24 04:38:16

As Martin Maillard showed, the requests library worked for me.

Also, in another thread, I noticed this comment by leoluk:

Edit: It's 2014 now, and most of the important libraries have been ported and you should definitely use Python 3 if you can. python-requests is a very nice high-level library which is easier to use than urllib2.

So I wrote a small function to fetch the page:

import requests

def get_page(website_url):
    # Fetch the URL and return the raw response body as bytes
    response = requests.get(website_url)
    return response.content

print(get_page('http://example.com'))

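Cheers!

User

Reply 2 · edited 2024-04-24 04:38:16

I tried many things, including urllib, urllib2, and several others, but one library did everything I needed and solved every problem I ran into: Mechanize. It emulates a real browser, so it handles many of the issues that come up in this area.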
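
For readers who have not used Mechanize, here is a minimal sketch of fetching a page with it. It assumes the third-party mechanize package is installed (pip install mechanize). The function name fetch_with_mechanize and the User-agent string are made up for illustration, and turning off robots.txt handling is an assumption that may not be appropriate for every site. Note that mechanize keeps cookies and follows redirects like a browser, but it does not execute JavaScript.

```python
def fetch_with_mechanize(url):
    """Fetch url with a mechanize Browser and return the response body."""
    # Third-party package; imported here so that defining this function
    # does not require mechanize to be installed.
    import mechanize

    browser = mechanize.Browser()
    # Assumption: skip robots.txt checks; only do this where you are allowed to.
    browser.set_handle_robots(False)
    # Hypothetical User-agent string, chosen for illustration only.
    browser.addheaders = [("User-agent", "Mozilla/5.0 (example)")]
    response = browser.open(url)
    return response.read()
```

Calling fetch_with_mechanize('http://example.com') should return the page body, provided the site is reachable.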

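User

Reply 3 · edited 2024-04-24 04:38:16

I tried it, and the request itself succeeds, but the content you get back says (in French) that your browser must accept cookies. You could probably work around this with urllib2, but I think the simplest way is to use the requests library (if you don't mind an extra dependency).

To install requests: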

pip install requests

Then in your script:

import requests

url = 'http://france.meteofrance.com/france/meteo?PREVISIONS_PORTLET.path=previsionsville/750560'

response = requests.get(url)
print(response.content)

I'm pretty sure the page source you get will be what you expect.
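
If you would rather avoid the extra dependency, the urllib2 route mentioned above can be sketched with the standard library alone. In Python 3, urllib2 became urllib.request, and http.cookiejar supplies the cookie storage. The helper names build_cookie_opener and get_page and the User-Agent string are hypothetical, and I have not verified that this removes the cookie message on the site in question.

```python
import http.cookiejar
import urllib.request


def build_cookie_opener():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    # Hypothetical User-Agent; some sites reject the default Python one.
    opener.addheaders = [("User-Agent", "Mozilla/5.0 (example)")]
    return opener, jar


def get_page(url, opener, timeout=10):
    """Fetch url through the cookie-aware opener and return the raw bytes."""
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Because cookies are only set once a response arrives, a site that checks for them may need two calls to get_page with the same opener: the first collects the cookie and the second sends it back.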
