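How to extract tweets with latitude and longitude into a text file
I want to use Python to extract tweets that carry latitude and longitude from Twitter and save them to a text file.
For example, I would like the extracted text file to contain lines like these: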
[50.4146912, -119.2066755] 6 2011-08-28 19:24:29 @NaomiAKlein @TheRealRoseanne "BreakingNews: President Obama to deliver live statement on Hurricane Irene from Rose Garden - NBC News"
[38.896544300000002, -76.994223250000005] 6 2011-08-28 19:26:31 RT @ProducerMatthew: President Obama to deliver statement at 2pm PT / 5pm ET on Hurricane #Irene from the Rose Garden.
[33.787082099999999, -118.1678924] 6 2011-08-28 19:38:06 Ps. As the joke in itself is what ones know for ones selves as ones do to you yourselves to Obama self, ones government to the police
[43.108731089999999, -89.335464060000007] 6 2011-08-28 19:46:44 “@crewislife: US Federal debt increases by U.S Presidents: Reagan 186% Bush I 54% Clinton 41% Bush II 72% Obama 23% Source: CBO #wiunion
[43.108731089999999, -89.335464060000007] 6 2011-08-28 19:47:40 RT @crewislife: US Federal debt increases by U.S Presidents: Reagan 186% Bush I 54% Clinton 41% Bush II 72% Obama 23% Source: CBO #wiunion
1 Answer
Here is a link to the documentation for the Twitter REST API.
And here are some basics to get you started pulling data from Twitter:
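import json, pprint
from urllib.request import urlopen

# Query the search endpoint for up to 25 tweets matching "obama".
u = urlopen('http://search.twitter.com/search.json?q=obama&rpp=25')
resultdict = json.load(u)
pprint.pprint(resultdict)  # inspect the full response structure
# Print the text of each returned tweet.
for tweet in resultdict['results']:
    print(tweet['text'])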
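Note that the latitude and longitude of a location are not provided directly. Twitter turns locations into "place codes", which you need to reverse-resolve yourself: https://dev.twitter.com/terms/geo-developer-guidelines
The rest is left for you to explore :-)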
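As a possible starting point for the file-writing step the question asks about, here is a minimal sketch. It assumes each tweet dict may carry a `geo` entry holding a `coordinates` pair and a `created_at` timestamp; those key names are assumptions and are not confirmed by the answer above, so check them against the actual response. The helper names `format_geotagged` and `save_geotagged` are hypothetical, and the `6` column from the desired output is omitted because its meaning is not stated.

```python
def format_geotagged(tweets):
    """Yield one output line per tweet that carries coordinates."""
    for tweet in tweets:
        geo = tweet.get('geo')  # assumed key; may be missing or None
        if not geo or not geo.get('coordinates'):
            continue  # skip tweets without latitude/longitude
        lat, lon = geo['coordinates'][:2]
        # Collapse newlines so each tweet stays on a single line.
        text = ' '.join(tweet.get('text', '').split())
        created = tweet.get('created_at', '')  # assumed key
        yield '[{}, {}] {} {}'.format(lat, lon, created, text)


def save_geotagged(tweets, path):
    """Write the formatted lines to a UTF-8 text file; return the count."""
    lines = list(format_geotagged(tweets))
    with open(path, 'w', encoding='utf-8') as out:
        for line in lines:
            out.write(line + '\n')
    return len(lines)


# Hypothetical sample data standing in for resultdict['results'].
sample = [
    {'text': 'hello\nworld', 'created_at': '2011-08-28 19:24:29',
     'geo': {'coordinates': [50.4146912, -119.2066755]}},
    {'text': 'no location', 'created_at': '2011-08-28 19:26:31',
     'geo': None},
]
lines = list(format_geotagged(sample))
```

Calling `save_geotagged(resultdict['results'], 'tweets.txt')` would then write only the tweets that have coordinates, one per line. Tweets that expose only a place code would still need to be resolved separately.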