No search results returned from the server: using the Facebook Graph API from Python
While working through some simple exercises with Python and the Facebook Graph API, I ran into a strange problem:
import time
import sys
import urllib2
import urllib
from json import loads

base_url = "https://graph.facebook.com/search?q="
post_id = None
post_type = None
user_id = None
message = None
created_time = None

def doit(hour):
    page = 1
    search_term = "\"Plastic Planet\""
    encoded_search_term = urllib.quote(search_term)
    print encoded_search_term
    type = "&type=post"
    url = "%s%s%s" % (base_url, encoded_search_term, type)
    print url

    while(1):
        try:
            response = urllib2.urlopen(url)
        except urllib2.HTTPError, e:
            print e
        finally:
            pass
        content = response.read()
        content = loads(content)
        print "=================================="
        for c in content["data"]:
            print c
        print "****************************************"
        try:
            content["paging"]
            print "current URL"
            print url
            print "next page!------------"
            url = content["paging"]["next"]
            print url
        except:
            pass
        finally:
            pass
        """
        print "new URL is ======================="
        print url
        print "=================================="
        """
        print url
What I am trying to do is page through the search results automatically, using content["paging"]["next"] to fetch each next page.
The strange thing is that no data comes back; all I receive is:
{"data":[]}
even on the very first iteration of the loop.
Yet when I paste the same URL into a browser, I get plenty of results.
I also tried it with my access token, but the result was the same.
+++++++++++++++++++ EDITED AND SIMPLIFIED ++++++++++++++++++
Thanks to TryPyPy, here is a simplified, edited version of my earlier question:
Why does:
import urllib2
url = "https://graph.facebook.com/search?q=%22Plastic+Planet%22&type=post&limit=25&until=2010-12-29T19%3A54%3A56%2B0000"
response = urllib2.urlopen(url)
print response.read()
result in {"data":[]}
?
But the same URL produces plenty of data in a browser?
1 Answer
After a lot of trial and error with Chrome (got plenty of data) and Firefox (got an empty response), I found that the culprit is mostly the 'Accept-Language' request header. The other changes should be merely cosmetic, but I'm not sure about the CookieJar part.
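In modern Python terms, the crucial change boils down to registering that header on the opener. This is a minimal sketch using urllib.request (the Python 3 name for urllib2); no request is sent, it only shows how the header is attached:

```python
import urllib.request  # Python 3's successor to urllib2

# Build an opener that will send an Accept-Language header with
# every request, mirroring the fix described above.
opener = urllib.request.build_opener()
opener.addheaders = [('Accept-Language', 'en-US,en;q=0.8')]

# No network call is made here; we only confirm the header is registered.
print(opener.addheaders)  # -> [('Accept-Language', 'en-US,en;q=0.8')]
```

Any request made via opener.open() will now carry that header, which is what the browser was sending all along.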
import time
import sys
import urllib2
import urllib
from json import loads
import cookielib

base_url = "https://graph.facebook.com/search?q="
post_id = None
post_type = None
user_id = None
message = None
created_time = None

jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
opener.addheaders = [
    ('Accept-Language', 'en-US,en;q=0.8'),]

def doit(hour):
    page = 1
    search_term = "\"Plastic Planet\""
    encoded_search_term = urllib.quote(search_term)
    print encoded_search_term
    type = "&type=post"
    url = "%s%s%s" % (base_url, encoded_search_term, type)
    print url

    data = True
    while data:
        response = opener.open(url)
        opener.addheaders += [
            ('Referer', url)]
        content = response.read()
        content = loads(content)
        print "=================================="
        for c in content["data"]:
            print c.keys()
        print "****************************************"
        if "paging" in content:
            print "current URL"
            print url
            print "next page!------------"
            url = content["paging"]["next"]
            print url
        else:
            print content
            print url
            data = False

doit(1)
And here is a simplified, working version:
import urllib2
import urllib
from json import loads
import cookielib

def doit(search_term, base_url="https://graph.facebook.com/search?q="):
    opener = urllib2.build_opener()
    opener.addheaders = [('Accept-Language', 'en-US,en;q=0.8')]
    encoded_search_term = urllib.quote(search_term)
    type = "&type=post"
    url = "%s%s%s" % (base_url, encoded_search_term, type)
    print encoded_search_term
    print url

    data = True
    while data:
        response = opener.open(url)
        content = loads(response.read())
        print "=================================="
        for c in content["data"]:
            print c.keys()
        print "****************************************"
        if "paging" in content:
            url = content["paging"]["next"]
        else:
            print "Empty response"
            print content
            data = False

doit('"Plastic Planet"')
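The stop condition of that loop is easy to check without touching the network: feed json.loads responses shaped like the Graph API's, and the same "paging" in content test decides whether to continue. The post id and "next" URL below are fabricated for illustration:

```python
import json

# Two fabricated responses shaped like Graph API search results.
pages = [
    '{"data": [{"id": "1_1", "message": "first"}],'
    ' "paging": {"next": "https://graph.facebook.com/search?q=x&page=2"}}',
    '{"data": []}',  # an empty page with no "paging" key ends the loop
]

seen = []
for raw in pages:
    content = json.loads(raw)
    for c in content["data"]:
        seen.append(c["id"])
    if "paging" not in content:
        break  # same effect as setting data = False above

print(seen)  # -> ['1_1']
```

Only the ids collected before the empty page survive, which is exactly how the real loop terminates once Facebook returns {"data":[]} with no paging block.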