Reducing Twitter API load by storing 100 users, then requesting their info and placing it alongside other info - Python

Question

I know the title is long, but I couldn't come up with anything else ;)

I'm writing a Python script that saves the results of the Twitter Search API to a CSV file.

writer = csv.writer(open('stocks.csv', 'a', buffering=0))
writer.writerows([(screen_name, hashtags, expanded_url , coordinates , geo , in_reply_to_user_id, followers)])

But I also want to add the follower count of the user who sent the tweet!

This can be done with Twitter's GET users/lookup API. It is limited to 350 requests per hour, but a single request can look up as many as 100 users at once.
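To illustrate the 100-users-per-request point, here is a minimal sketch that turns a list of screen names into as few lookup URLs as possible. It assumes the v1 `users/lookup.json` endpoint used in the script further down and that the endpoint accepts a comma-separated `screen_name` parameter; `chunked` and `build_lookup_urls` are hypothetical helper names, not part of any library.

```python
try:
    from urllib import quote  # Python 2, as in the script in this question
except ImportError:
    from urllib.parse import quote  # Python 3

# Assumption: the v1 endpoint that appears in the question's script.
LOOKUP_URL = "https://api.twitter.com/1/users/lookup.json"
MAX_USERS_PER_LOOKUP = 100


def chunked(names, size=MAX_USERS_PER_LOOKUP):
    """Yield successive lists of at most `size` names."""
    for start in range(0, len(names), size):
        yield names[start:start + size]


def build_lookup_urls(screen_names):
    """Return one lookup URL per group of up to 100 unique screen names."""
    seen = set()
    unique = []
    for name in screen_names:
        # Drop repeats so a user who tweets twice costs only one slot.
        if name not in seen:
            seen.add(name)
            unique.append(name)
    return [
        LOOKUP_URL + "?screen_name=" + quote(",".join(chunk), safe=",")
        for chunk in chunked(unique)
    ]
```

With this grouping, 250 distinct users would cost 3 requests instead of 250.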

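Right now, whenever my script finds a tweet, it looks up the user's follower count and saves it to the CSV file together with the rest of the tweet's information.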

This works well, but after 350 lookups I hit the limit!!

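So my question is: can I make the script look up 100 users at a time, storing the usernames as it goes? Once it has collected 100, it would call GET users/lookup and insert the results next to the search results in the Excel file.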
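The buffering idea asked about here could look roughly like the sketch below. It holds tweet rows back until 100 distinct users have accumulated, resolves their follower counts in one call, and only then writes the rows with the count as the last column. `FollowerBatcher` and `lookup_followers` are hypothetical names: `lookup_followers` is assumed to take a list of screen names and return a dict mapping each name to its follower count, and `writer` is assumed to be a `csv.writer`.

```python
BATCH_SIZE = 100  # users/lookup accepts up to 100 users per request


class FollowerBatcher(object):
    """Buffer tweet rows and resolve follower counts one batch at a time."""

    def __init__(self, writer, lookup_followers, batch_size=BATCH_SIZE):
        self.writer = writer                      # e.g. a csv.writer
        self.lookup_followers = lookup_followers  # hypothetical: names -> {name: count}
        self.batch_size = batch_size
        self.pending = []   # (screen_name, row) pairs waiting for a count
        self.names = set()  # distinct users in the pending rows

    def add(self, screen_name, row):
        """Queue one tweet row; flush once enough distinct users are queued."""
        self.pending.append((screen_name, list(row)))
        self.names.add(screen_name)
        if len(self.names) >= self.batch_size:
            self.flush()

    def flush(self):
        """Look up all queued users in one call and write their rows."""
        if not self.pending:
            return
        counts = self.lookup_followers(sorted(self.names))
        for name, row in self.pending:
            # Leave the cell empty if the lookup did not return this user.
            self.writer.writerow(row + [counts.get(name, "")])
        self.pending = []
        self.names = set()
```

In the search loop, `batcher.add(user, row)` would replace the direct `writerows` call, with one final `batcher.flush()` after the loop so a partial last batch is not lost.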

Excel example:

 [info from search ...(in many columns)] [followers of the user who sent the tweet]
 [info from search ...(in many columns)] [followers of the user who sent the tweet]
 [info from search ...(in many columns)] [followers of the user who sent the tweet]

As requested:

import urllib2
import urllib
import json
import time


s = u'@apple OR @iphone OR @aapl OR @imac OR @ipad OR @mac OR @macbook OR macbook OR mac OR ipad OR iphone 4s OR iphone 5 OR @iphone4s OR @ iphone 5 OR aapl OR iphone'

info = urllib2.quote(s.encode("utf8"))
page = "?q="

openurl = urllib.urlopen("http://search.twitter.com/search.json" + page + info)

quota = 150
user = 'twitter'
user_info = urllib.urlopen("https://api.twitter.com/1/users/lookup.json?screen_name="+user)

while quota > 10:
 openurl2 = urllib.urlopen("https://api.twitter.com/1/account/rate_limit_status.json")
 twitter_quota = openurl2.read()
 quota_json = json.loads(twitter_quota)
 quota = quota_json['remaining_hits']


 twitter_search = openurl.read()

 table_search = json.loads(twitter_search)
 print table_search

 print str(table_search[u'results'][1][u'iso_language_code'])


 lines = 0

 linesmax = len(table_search[u'results'])
 print linesmax

 while lines < linesmax:
    table_timeline_inner = table_search[u'results'][lines]


    next = table_search[u'next_page']
    lang = table_timeline_inner[u'iso_language_code']
    to = table_timeline_inner[u'to_user_name']
    text = table_timeline_inner[u'text']
    user = table_timeline_inner[u'from_user']
    geo = table_timeline_inner[u'geo']
    time = table_timeline_inner[u'created_at']
    result_type = table_timeline_inner[u'metadata'][u'result_type']
    id = table_timeline_inner[u'id']

