Extract a JSON key from a text file, then make an HTTP request

Posted 2024-04-16 22:18:51

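I am trying to extract a specific key (a URL) from JSON results stored in a text file (tweets.txt), and then make an HTTP GET request to the extracted URL. The HTTP response should be saved as a new HTML file in a directory. The string I want to extract is the value of a particular JSON key.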

For example: "display_url":"test.com" (extract "test.com", then make the HTTP request)

My code:

import json
import requests
with open('tweets.txt') as input_file:
    for line in input_file:
        tweet_json = json.loads(line)
        response = requests.get(tweet_json.get('display_url')) if 'display_url' in tweet_json else {}
        if response and response.status_code()==200:
            print(response.html)

Contents of tweets.txt:

{"created_at":"Thu Nov 15 11:35:00 +0000 2018","id":15292802,"id_str":325802","text":"test8  https:\/\/t.co\/ZtCsuk7Ek2   #osining","source":"\u003ca href=\"http:\/\/twitter.com\" rel=\"nofollow\"\u003eTwitter Web Client\u003c\/a\u003e","truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":{"id":961508561217052675,"id_str":"961508561217052675","name":"Online S","screen_name":"osectraining","location":"Israel","url":"https:\/\/www.test.co.il","description":"test","translator_type":"none","protected":false,"verified":false,"followers_count":2,"friends_count":51,"listed_count":0,"favourites_count":0,"statuses_count":7,"created_at":"Thu Feb 08 07:54:39 +0000 2018","utc_offset":null,"time_zone":null,"geo_enabled":false,"lang":"en","contributors_enabled":false,"is_translator":false,"profile_background_color":"000000","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_link_color":"1B95E0","profile_sidebar_border_color":"000000","profile_sidebar_fill_color":"000000","profile_text_color":"000000","profile_use_background_image":false,"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/961510231346958336\/d_KhBeTD_normal.jpg","profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/961510231346958336\/d_KhBeTD_normal.jpg","profile_banner_url":"https:\/\/pbs.twimg.com\/profile_banners\/961508561217052675\/1518076913","default_profile":false,"default_profile_image":false,"following":null,"follow_request_sent":null,"notifications":null},"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"quote_count":0,"reply_count":0,"retweet_count":0,"favorite_count":0,"entities":{"hashtags":[{"text":"osectraining","indices":[33,46]}],"urls":[{"url":
"https:\/\/t.co\/ZtCsuk7Ek2","expanded_url":"http:\/\/test.com","display_url":"test.com","indices":[7,30]}],"user_mentions":[],"symbols":[]},"favorited":false,"retweeted":false,"possibly_sensitive":false,"filter_level":"low","lang":"en","timestamp_ms":"1542281700508"}

1 answer
User
#1 · Posted 2024-04-16 22:18:51

I think your problem is with reading the file line by line. Run json.loads on open('tweets.txt').read() instead, then look up display_url and, if it is present, make the request with it.

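def send_req():
    import json
    import requests
    # Parse the whole file as one JSON document instead of line by line
    with open('tweets.txt') as f:
        tweet_json = json.loads(f.read())
    # In the sample tweet, display_url is nested under entities -> urls
    urls = tweet_json.get('entities', {}).get('urls', [])
    if not urls or 'display_url' not in urls[0]:
        return False
    # display_url has no scheme, so add one before making the request
    response = requests.get('http://' + urls[0]['display_url'])
    if response.status_code == 200:
        print(response.text)  # Response exposes .text; there is no .html
        return True
    else:
        return False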

Call this function repeatedly until it returns True:

import time
while not send_req():
    time.sleep(60) # seconds
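
The question also asks for each response to be saved as an HTML file, which the code above does not do. Here is a minimal sketch of that step, assuming tweets.txt holds a single valid JSON object and that display_url sits under entities -> urls as in the sample. The helper names extract_display_urls and save_pages, the pages output directory, the http:// prefix and the file-naming scheme are all illustrative assumptions, not part of the original post.

```python
import json
from pathlib import Path


def extract_display_urls(tweet):
    """Return every display_url found under entities -> urls."""
    urls = tweet.get("entities", {}).get("urls", [])
    return [u["display_url"] for u in urls if "display_url" in u]


def save_pages(tweets_path="tweets.txt", out_dir="pages"):
    """Fetch each display_url and save the response body as an HTML file."""
    # Imported here so the parsing helper works without requests installed.
    import requests

    # Parse the whole file at once rather than line by line.
    tweet = json.loads(Path(tweets_path).read_text(encoding="utf-8"))
    Path(out_dir).mkdir(exist_ok=True)
    saved = []
    for display_url in extract_display_urls(tweet):
        # Assumption: display_url carries no scheme, so http:// is prepended.
        response = requests.get("http://" + display_url, timeout=10)
        if response.status_code == 200:
            # Hypothetical naming scheme: slashes become underscores.
            name = display_url.replace("/", "_") + ".html"
            target = Path(out_dir) / name
            target.write_text(response.text, encoding="utf-8")
            saved.append(target)
    return saved
```

Calling save_pages() returns the list of files it wrote. Network errors raised by requests.get are not caught in this sketch, so a real script would probably wrap the call in a try/except block.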
