Parsing Twitter JSON objects with Python

2 votes
4 answers
19963 views
Asked 2025-04-17 15:45

I am trying to download tweets from Twitter.

I am using Python and Tweepy for this, but I am new to both Python and the Twitter API.

My Python script is as follows:

#!/usr/bin/python

#import modules
import sys
import tweepy
import json

#global variables
consumer_key = ''
consumer_secret = ''
token_key = ''
token_secret = ''

#Main function
def main():
    print sys.argv[0],'starts'
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(token_key, token_secret)
    print 'Connected to Twitter'
    api = tweepy.API(auth)
    if not api.test():
        print 'Twitter API test failed'

    print 'Experiment with cursor'
    print 'Get search method returns json objects'

    json_search = api.search(q="football")
    #json.loads(json_search())
    print json_search


#Standard boilerplate to call main function if this file runs

if __name__ == '__main__':
    main()

The result I get is:

[<tweepy.models.SearchResult object at 0x9a0934c>, <tweepy.models.SearchResult object at 0x9a0986c>, <tweepy.models.SearchResult object at 0x9a096ec>, <tweepy.models.SearchResult object at 0xb76d8ccc>, <tweepy.models.SearchResult object at 0x9a09ccc>, <tweepy.models.SearchResult object at 0x9a0974c>, <tweepy.models.SearchResult object at 0x9a0940c>, <tweepy.models.SearchResult object at 0x99fdfcc>, <tweepy.models.SearchResult object at 0x99fdfec>, <tweepy.models.SearchResult object at 0x9a08cec>, <tweepy.models.SearchResult object at 0x9a08f4c>, <tweepy.models.SearchResult object at 0x9a08eec>, <tweepy.models.SearchResult object at 0x9a08a4c>, <tweepy.models.SearchResult object at 0x9a08c0c>, <tweepy.models.SearchResult object at 0x9a08dcc>]

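Now I am a bit confused: how do I extract the tweets from this? I tried calling json.loads on this data, but it raised an error saying that JSON expects a string or buffer. Some example code would be great. Thanks!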

4 Answers

1

You can use Tweepy's JSON parser for this. Below is the code I use on App Engine to serve a JSONP response, which works with a jQuery client:

import webapp2
import tweepy
import json
from tweepy.parsers import JSONParser

class APISearchHandler(webapp2.RequestHandler):
    def get(self):

        CONSUMER_KEY = 'xxxx'
        CONSUMER_SECRET = 'xxxx'
        ACCESS_TOKEN_KEY = 'xxxx'
        ACCESS_TOKEN_SECRET = 'xxxx'

        auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
        auth.set_access_token(ACCESS_TOKEN_KEY, ACCESS_TOKEN_SECRET)
        api = tweepy.API(auth, parser=JSONParser())

        # Query String Parameters
        qs = self.request.get('q')
        max_id = self.request.get('max_id')

        # JSONP Callback
        callback = self.request.get('callback')

        max_tweets = 100
        search_results = api.search(q=qs, count=max_tweets, max_id=max_id)
        json_str = json.dumps( search_results )

        if callback:
            response = "%s(%s)" % (callback, json_str)
        else:
            response = json_str

        self.response.write( response )

So the key point is

api = tweepy.API(auth, parser=JSONParser())
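
With JSONParser, the API calls hand back plain dicts and lists decoded from Twitter's JSON rather than model objects, which is why json.dumps accepts them. The serialization and JSONP-wrapping step of the handler above can be sketched on its own, without credentials. The function name build_response, the callback name handleTweets and the sample dict are hypothetical and only stand in for a real search response:

```python
import json


def build_response(search_results, callback=None):
    """Serialize already-decoded results; wrap them in callback(...) for JSONP."""
    json_str = json.dumps(search_results)
    if callback:
        # JSONP: the client-side script calls `callback` with the JSON payload
        return "%s(%s)" % (callback, json_str)
    return json_str


# Hypothetical sample standing in for what the API call returns with JSONParser
sample = {"statuses": [{"id": 1, "text": "football"}]}

print(build_response(sample))
print(build_response(sample, callback="handleTweets"))
```

Without a callback the client receives plain JSON; with one, the same payload arrives wrapped in a function call.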
8

Tweepy gives you richer objects; it has already parsed the JSON for you.
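
That is also why json.loads complains: it only accepts JSON text, and the search results are already Python objects. A small sketch of the difference, using a hypothetical FakeResult class in place of Tweepy's SearchResult so that it runs without credentials (written with print() so it also runs on Python 3):

```python
import json


class FakeResult(object):
    """Hypothetical stand-in for an already-parsed Tweepy result object."""

    def __init__(self, text):
        self.text = text


# What a search call hands back is a list of objects, not a JSON string
parsed = [FakeResult("kick-off at 3pm")]

try:
    json.loads(parsed)  # json.loads only accepts JSON text
    error = None
except TypeError as exc:
    error = exc

print("json.loads rejected the list: %s" % (error is not None))
print(parsed[0].text)  # the data is reachable as a normal attribute
```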

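The attributes of a SearchResult object mirror the structure of the JSON that Twitter sends; just check the tweet documentation to see which attributes are available: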

for result in api.search(q="football"):
    print result.text

Example:

>>> import tweepy
>>> tweepy.__version__
'3.3.0'
>>> consumer_key = '<consumer_key>'
>>> consumer_secret = '<consumer_secret>'
>>> access_token = '<access_token>'
>>> access_token_secret = '<access_token_secret>'
>>> auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
>>> auth.set_access_token(access_token, access_token_secret)
>>> api = tweepy.API(auth)
>>> for result in api.search(q="football"):
...     print result.text
... 
Great moments from the Women's FA Cup http://t.co/Y4C0LFJed9
RT @freebets: 6 YEARS AGO TODAY: 

Football lost one of its great managers. 

RIP Sir Bobby Robson. http://t.co/NCo90ZIUPY
RT @Oddschanger: COMPETITION CLOSES TODAY!

Win a Premier League or Football League shirt of YOUR choice! 

RETWEET &amp; FOLLOW to enter. http…
Berita Transfer: Transfer rumours and paper review – Friday, July 31 http://t.co/qRrDIEP2zh [TS] #nobar #gosip
@ajperry18 im sorry I don't know this football shit
@risu_football おれモロ誕生日で北辰なんすよ笑
NFF Unveils Oliseh As Super Eagles Coach - SUNDAY Oliseh has been unveiled by the Nigeria Football... http://t.co/IOYajD9bi2 #Sports
RT @BilelGhazi: RT @lequipe : Gourcuff, au tour de Guingamp http://t.co/Dkio8v9LZq
@EDS_Amy HP SAUCE ?
RT @fsntweet: マンCの塩対応に怒りの炎!ベトナム人ファン、チケットを燃やして猛抗議 - http://t.co/yg5iuABy3K 

なめるなよ、プレミアリーグ!マンチェスターCのプレシーズンツアーの行き先でベトナム人男性が、衝撃的な行
RT @peterMwendo: Le football cest un sport collectif ou on doit se faire des passe http://t.co/61hy138yo8
RT @TSBible: 6 years ago today, football lost a true gentleman. Rest in Peace Sir Bobby Robson. http://t.co/6eHTI6UxaC
6 years ago today the greatest football manger of all time passed away SIR Bobby Robson a true Ipswich and footballing legend
The Guardian: PSG close to sealing £40m deal for Manchester United’s Ángel Di María. http://t.co/gAQEucRLZa
Sir Bobby Robson, the #football #legend passed away 6 years ago. 

#Barcelona #newcastle #Porto http://t.co/4UXpnvrHhS
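
If you do want JSON text rather than Python objects, one option is to copy the attributes you need into plain dicts and serialize those. A minimal sketch, using SimpleNamespace objects as hypothetical stand-ins for the SearchResult objects so that it runs without credentials; the helper name to_records and the sample values are illustrative only:

```python
import json
from types import SimpleNamespace

# Hypothetical stand-ins for the objects returned by a search call
results = [
    SimpleNamespace(id=1, text="Great moments from the Women's FA Cup"),
    SimpleNamespace(id=2, text="RIP Sir Bobby Robson."),
]


def to_records(results, fields=("id", "text")):
    """Copy the chosen attributes of each result into a plain dict."""
    return [{name: getattr(r, name, None) for name in fields} for r in results]


records = to_records(results)
print(json.dumps(records, indent=2))
```

Attributes that a result does not have come out as None rather than raising an error, which is a design choice of this sketch.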
0

This is the code I wrote with tweepy:

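import tweepy

# consumer_key, consumer_secret, access_key and access_secret are assumed
# to be defined elsewhere with your own credentials.
def twitterfeed():
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    # Cursor pages through the home timeline; items(20) stops after 20 tweets
    statuses = tweepy.Cursor(api.home_timeline).items(20)
    # Encode each tweet's text as UTF-8 bytes before printing
    data = [s.text.encode('utf8') for s in statuses]
    print data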
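
The encode('utf8') call turns each tweet's text into UTF-8 bytes, which can avoid a UnicodeEncodeError when Python 2 prints non-ASCII tweets to a terminal that cannot display them. A short sketch of the round trip, with illustrative sample strings; on Python 3 you would normally keep the text as str and skip the encode step:

```python
# Illustrative sample texts, one of them non-ASCII
texts = [u"@risu_football おれモロ誕生日", u"Sir Bobby Robson"]

# Same step as in the list comprehension above: text -> UTF-8 bytes
encoded = [t.encode("utf8") for t in texts]
print(encoded)

# Decoding with the same codec recovers the original text unchanged
decoded = [b.decode("utf8") for b in encoded]
print(decoded == texts)
```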
