Scraping Instagram with Python and JSON

Published 2024-05-31 23:50:00


import requests
import time
import json
from multiprocessing import Pool

with open('posts.json', 'r') as f:
    arr = json.loads(f.read())  # load json data from previous step
links = []
locations = []

def get_short_code(arr):
    for item in arr:
        shortcode = item['shortcode']
        links.append(shortcode)
    return links

def get_locations(shortcode):
    link = "https://www.instagram.com/p/{0}/?__a=1".format(shortcode)
    r = requests.get(link)
    data = json.loads(r.text)

    try:
        location_name = data["graphql"]["shortcode_media"]["location"]["name"]  # get location for a post
        location_city = data['graphql']['shortcode_media']['location']['cityname'].split(',')[0]

    except:
        location_city = ''
        location_name = ''

    print(location_name)

    locations.append({'shortcode': shortcode, 'location_name': location_name, 'location_city': location_city})
    time.sleep(3)
    return locations

if __name__=='__main__':
    pool = Pool(processes=2)
    pool.map(get_locations, get_short_code(arr))
    if len(locations)%10 == 1:
        with open('locations.json', 'w') as outfile:
            json.dump(locations, outfile)  # save to json

I know that using Pool together with time.sleep is pointless, but I just wanted to see how it works :)

I have read about the Instagram APIs, but I am only scraping Instagram for a personal project.

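The problem is this: I have a list of shortcodes. When I ran this code on about 10 posts, I got the JSON with the places. But when I run the whole code on the full list, I only get blanks. Why does it return empty results?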
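
One possible cause, assuming the blanks come from the multiprocessing setup rather than from Instagram's response: each Pool worker runs in its own process with its own copy of the module-level locations list, so what the workers append never reaches the list in the parent process. The parent's list stays empty, and the len(locations) % 10 == 1 check is then false. Below is a minimal sketch of the usual alternative, which returns one result per call and collects what pool.map hands back. SAMPLE_RESPONSES, get_location and collect_locations are hypothetical names, and the canned dictionary stands in for the requests.get call only so that the sketch runs offline.

```python
import json
from multiprocessing import Pool

# Hypothetical canned data standing in for the HTTP response in the question.
# A real worker would fetch and decode the post JSON here instead.
SAMPLE_RESPONSES = {
    "abc": {"graphql": {"shortcode_media": {
        "location": {"name": "Cafe", "cityname": "Paris, France"}}}},
    "def": {"graphql": {"shortcode_media": {"location": None}}},
}


def get_location(shortcode):
    """Build and return one result dict; no shared global list is touched."""
    data = SAMPLE_RESPONSES.get(shortcode, {})
    # Posts without a location carry None there, so fall back to an empty dict.
    location = data.get("graphql", {}).get("shortcode_media", {}).get("location") or {}
    return {
        "shortcode": shortcode,
        "location_name": location.get("name") or "",
        "location_city": (location.get("cityname") or "").split(",")[0],
    }


def collect_locations(shortcodes, processes=2):
    """Run get_location in worker processes and gather the returned dicts."""
    with Pool(processes=processes) as pool:
        # pool.map sends each return value back to the parent, in input order.
        return pool.map(get_location, shortcodes)


if __name__ == "__main__":
    results = collect_locations(["abc", "def"])
    # The question writes to locations.json; printing keeps the sketch side-effect free.
    print(json.dumps(results))
```

Because the parent builds its list from return values, the order matches the input shortcodes and nothing depends on state shared between processes. If the real endpoint returns something other than the expected JSON, the fields would still come back empty, so it may be worth printing r.status_code and the start of r.text for one failing shortcode.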
