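How to stop, kill, or close a PycURL request that uses the Twitter Stream
I am currently using cURL to pull data from Twitter's streaming API (http://stream.twitter.com/1/statuses/sample.json), so data keeps arriving continuously. I want to stop the stream once I have collected a certain number of tweets (10 in this example, picked arbitrarily).
You can see in the code below where I try to close the connection. Because the stream never ends, curling.perform() never returns, so the code after it is never reached. I therefore tried to close the stream from inside body_callback, but close() cannot be called while perform() is still running.
Any help would be appreciated.
Code: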
# Imports
import pycurl # Used for doing cURL request
import base64 # Used to encode username and API Key
import json # Used to break down the json objects
# Settings to access stream and API
userName = 'twitter_username' # My username
password = 'twitter_password' # My API Key
apiURL = 'http://stream.twitter.com/1/statuses/sample.json' # the twitter api
tweets = [] # An array of Tweets
# Methods to do with the tweets array
def how_many_tweets():
    print 'Collected: ',len(tweets)
    return len(tweets)
class Tweet:
    def __init__(self):
        self.raw = ''
        self.id = ''
        self.content = ''
    def decode_json(self):
        return True
    def set_id(self):
        return True
    def set_content(self):
        return True
    def set_raw(self, data):
        self.raw = data
# Class to print out the stream as it comes from the API
class Stream:
    def __init__(self):
        self.tweetBeingRead = ''
    def body_callback(self, buf):
        # This gets whole Tweets, and adds them to an array called tweets
        if(buf.startswith('{"in_reply_to_status_id_str"')): # This is the start of a tweet
            # Add the finished Tweet to the global array tweets
            print 'Added:' # Printing output to console
            print self.tweetBeingRead # Printing output to console
            theTweetBeingProcessed = Tweet() # Create a new Tweet Object
            theTweetBeingProcessed.set_raw(self.tweetBeingRead) # Set its raw value to tweetBeingRead
            tweets.append(theTweetBeingProcessed) # Add it to the global array of tweets
            # Start processing a new tweet
            self.tweetBeingRead = buf # Start a new tweet from scratch
        else:
            self.tweetBeingRead = self.tweetBeingRead+buf
        if(how_many_tweets()>10):
            try:
                curling.close() # This is where the problem lies. I want to close the stream
            except Exception as CurlError:
                print ' Tried closing stream: ',CurlError
# Used to initiate the cURLing of the Data Sift streams
datastream = Stream()
curling = pycurl.Curl()
curling.setopt(curling.URL, apiURL)
curling.setopt(curling.HTTPHEADER, ['Authorization: Basic '+base64.b64encode(userName+":"+password)])
curling.setopt(curling.WRITEFUNCTION, datastream.body_callback)
curling.perform() # This is where the cURLing starts
print 'I cant reach here.'
curling.close() # This never gets called. :(
1 Answer
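You can abort the transfer from the write callback by returning a number that differs from the number of bytes passed to it. (By default, returning None is treated the same as returning that number.)
When you abort this way, the whole transfer is considered done and perform() stops blocking.
The transfer does report an error, though, because it was aborted. In PycURL that error surfaces as a pycurl.error exception raised by perform() (libcurl's write error, code 23), so catch it around the call.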