Filtering images from tweets

4 votes

1 answer

6232 views

Asked 2025-04-18 06:16

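I am new to tweepy and would like to know how to find and save the images a user has posted in their tweets. I found a way to fetch a user's tweets in some tutorials, but nothing on how to filter out only the images.

I am using the code below to fetch a user's tweets. How can I get only the user's images?

Edit: I changed my code to the following: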

import requests
import tweepy

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(OAUTH_TOKEN, OAUTH_SECRET)
api = tweepy.API(auth)
timeline = api.user_timeline(count=10, screen_name="zenitiss")
for tweet in timeline:
    for media in tweet.entities.get("media", [{}]):
        print(media)
        # checks if there is any media-entity
        if media.get("type", None) == "photo":
            # checks if the entity is of the type "photo"
            image_content = requests.get(media["media_url"])
            print(image_content)

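However, the for loop does not seem to work properly: the line that prints media outputs an empty object. When I print the links for a user instead (for example karyperry), I get: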

{u'url': u'http://t.co/TaP2JZrpxu', u'indices': [42, 64], u'expanded_url':  
u'http://youtu.be/7bDLIV96LD4', u'display_url': u'youtu.be/7bDLIV96LD4'}
{u'url': u'https://t.co/t3hv7VQiPG', u'indices': [42, 65], u'expanded_url': 
u'https://vine.co/v/MgvxZA2qKbV', u'display_url': u'vine.co/v/MgvxZA2qKbV'}
{u'url': u'http://t.co/vnJAAU7KN6', u'indices': [50, 72], u'expanded_url':
u'http://instagram.com/p/n01XZjv-fp/', u'display_url': u'instagram.com/p/n01XZjv-fp/'}
{u'url': u'http://t.co/NycqAwtcgo', u'indices': [78, 100], u'expanded_url':
u'http://bit.ly/1o7xQRj', u'display_url': u'bit.ly/1o7xQRj'}
{u'url': u'http://t.co/BG6ozuRD6D', u'indices': [111, 133], u'expanded_url':
u'http://www.johnnywujek.com/sos', u'display_url': u'johnnywujek.com/sos'}
{u'url': u'http://t.co/nWIQ9ruJ3f', u'indices': [88, 110], u'expanded_url':
u'http://uncf.us/1kSXIwF', u'display_url': u'uncf.us/1kSXIwF'}
{u'url': u'http://t.co/yTbOgqt9fw', u'indices': [101, 123], u'expanded_url':
u'http://instagram.com/p/nvxD8eP-SZ/', u'display_url': u'instagram.com/p/nvxD8eP-SZ/'}

Most of these links are images, but when I replace 'media' with 'url' in the loop, i.e. tweet.entities.get("url",[{}]), most of the results are image links as well.


1 Answer

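Tweets (in their JSON representation) contain a "media" entity, whose fields are described in the Twitter API documentation. Assuming the tweet contains an image, Tweepy should expose this entity as a list of dicts, one per media item, so the URL of the first image is:

tweet.entities["media"][0]["media_url"]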
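The lookup above can be sketched as a small helper. This is only an illustration, assuming the entities value is the plain dict Tweepy exposes as tweet.entities; the helper name photo_urls and the sample dicts are hypothetical. It defaults to an empty list rather than [{}], so a tweet without a "media" key yields nothing instead of one empty dict, which is likely what the empty object in the question is. Links to external sites such as Instagram or YouTube live under the "urls" key and are not media entities, so they are not returned.

```python
def photo_urls(entities):
    """Return the media_url of every photo in a tweet's entities dict."""
    urls = []
    # "media" is absent when the tweet has no attached media, so default
    # to an empty list; a default of [{}] would yield one empty dict.
    for media in entities.get("media", []):
        if media.get("type") == "photo" and "media_url" in media:
            urls.append(media["media_url"])
    return urls


# Hypothetical entities dicts, shaped like the ones in the question.
with_photo = {"media": [{"type": "photo",
                         "media_url": "http://pbs.twimg.com/media/example.jpg"}]}
link_only = {"urls": [{"expanded_url": "http://youtu.be/7bDLIV96LD4"}]}

print(photo_urls(with_photo))  # ['http://pbs.twimg.com/media/example.jpg']
print(photo_urls(link_only))   # []
```

Inside the timeline loop this would be called as photo_urls(tweet.entities).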

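So if you want to save the image, you just need to download it, for example with Python's requests library. You could try adding something like the following to your code (or adapt it to your needs):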

import requests

for media in tweet.entities.get("media", []):
    # the default [] skips tweets that carry no media entity
    if media.get("type") == "photo":
        # only entities of type "photo" are images
        image_content = requests.get(media["media_url"])
        # save image_content.content to a file, etc.
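The "save to file" step could look roughly like this. It is a sketch under assumptions, not part of the original answer: the names filename_for and save_image are hypothetical, the code targets Python 3, and the download itself is passed in as a callable (fetch) so the helper does not depend on any particular HTTP library.

```python
import os
from urllib.parse import urlparse


def filename_for(url):
    """Derive a local file name from the last path segment of a media URL."""
    return os.path.basename(urlparse(url).path)


def save_image(url, directory, fetch):
    """Download url via fetch (a callable returning bytes) and save it.

    Returns the path of the file that was written.
    """
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, filename_for(url))
    with open(path, "wb") as handle:
        handle.write(fetch(url))
    return path


# Possible use inside the loop above, assuming requests is installed:
#     save_image(media["media_url"], "images",
#                lambda u: requests.get(u).content)
```

Two images whose URLs end in the same file name would overwrite each other in this sketch; prefixing the tweet id to the file name is one way around that.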

Answered 2025-04-18 by Python大师
