Error when reading multiple JSON files from an S3 bucket

Posted 2024-04-26 02:20:47


When I pass the JSON file's key manually and load it in the Python shell, it works fine. The code is as follows:

import json
import boto3

bucket_name = 'dev-data'
folder_name = 'raw/test/'
key_source = 'raw/test/extract_api_20200719.json'

s3_client = boto3.client('s3')
# Fetch a single object by its full key and decode the body as UTF-8 text.
json_obj = s3_client.get_object(Bucket=bucket_name, Key=key_source)
json_data = json_obj["Body"].read().decode('utf-8')
print("############################json_data####################### :", json_data)
print("############################json_data_type################## :", type(json_data))
# Parse the JSON text into a Python object.
json_dict = json.loads(json_data)
print("############################json_dict####################### :", json_dict)
print("############################json_dict_type ################# :", type(json_dict))

However, when I use a for loop to read the JSON objects from the S3 bucket, I get an error:

import json
import boto3

bucket_name = 'dev-data'
folder_name = 'raw/test/'

s3_resource = boto3.resource('s3')
s3_client = boto3.client('s3')
bucket = s3_resource.Bucket(bucket_name)
# Iterate over every object whose key starts with the folder prefix.
for obj in bucket.objects.filter(Prefix=folder_name):
    print('Object to extract :', obj)
    print('obj key: ', obj.key)
    json_obj = s3_client.get_object(Bucket=bucket_name, Key=obj.key)
    json_data = json_obj["Body"].read().decode('utf-8')
    # Fails here when json_data is not valid JSON (for example, an empty string).
    json_dict = json.loads(json_data)
This raises the following error:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

1 answer
Anonymous user
Reply #1 · Posted 2024-04-26 02:20:47

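Some of the entries returned by bucket.objects contain no JSON data, so check for them and skip them. Calling json.loads on an empty string raises exactly this "Expecting value: line 1 column 1 (char 0)" error. A likely cause is a zero-byte placeholder object for the folder itself (for example, a key equal to the prefix 'raw/test/'), which the prefix filter also returns.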
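
As a quick check that needs no S3 access, the same message can be reproduced by parsing an empty string, which is what the body of an empty object decodes to. The helper name decode_error_message is hypothetical and exists only for this illustration.

```python
import json


def decode_error_message(text):
    """Return the JSONDecodeError message for text, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


# An empty body is not valid JSON, so the parser fails at the first character.
print(decode_error_message(""))
# A well-formed document parses, so no message is returned.
print(decode_error_message('{"a": 1}'))
```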

# Create the client once, outside the loop.
s3_client = boto3.client('s3')
for obj in bucket.objects.filter(Prefix=folder_name):
    print('Object to extract :', obj)
    print('obj key: ', obj.key)
    json_obj = s3_client.get_object(Bucket=bucket_name, Key=obj.key)
    json_data = json_obj["Body"].read().decode('utf-8')
    # An empty body (e.g. a folder placeholder) is not valid JSON, so skip it.
    if not json_data:
        print("Skipping empty", obj.key)
        continue
    json_dict = json.loads(json_data)
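
A slightly more defensive variant is sketched below. It assumes that the files of interest have keys ending in ".json" (as in the question's example key) and that any other object under the prefix can be ignored. It also catches JSONDecodeError so that one malformed file does not stop the run. The names load_json_objects, load_from_bucket, and fetch_text are hypothetical. The filtering logic is kept separate from boto3 so it can be tried with plain dictionaries.

```python
import json
from typing import Callable, Dict, Iterable, Tuple


def load_json_objects(
    entries: Iterable[Tuple[str, int]],
    fetch_text: Callable[[str], str],
) -> Dict[str, object]:
    """Parse every non-empty *.json entry and skip everything else.

    entries    -- (key, size) pairs for the objects under the prefix
    fetch_text -- hypothetical callback returning the decoded body for a key
    """
    parsed = {}
    for key, size in entries:
        # Folder placeholders end with "/" and are zero bytes long.
        if key.endswith("/") or size == 0 or not key.endswith(".json"):
            print("Skipping non-JSON or empty object:", key)
            continue
        try:
            parsed[key] = json.loads(fetch_text(key))
        except json.JSONDecodeError as exc:
            # Report the offending key instead of aborting the whole run.
            print(f"Skipping {key}: invalid JSON ({exc})")
    return parsed


def load_from_bucket(bucket_name: str, prefix: str) -> Dict[str, object]:
    """Assumed wiring to S3 through the boto3 resource API."""
    import boto3  # imported lazily so load_json_objects works without boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    summaries = {obj.key: obj for obj in bucket.objects.filter(Prefix=prefix)}

    def fetch_text(key: str) -> str:
        return summaries[key].get()["Body"].read().decode("utf-8")

    entries = [(key, obj.size) for key, obj in summaries.items()]
    return load_json_objects(entries, fetch_text)
```

With the question's values, the call would be load_from_bucket('dev-data', 'raw/test/'). Checking the key and size before downloading avoids fetching objects that cannot contain JSON in the first place.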
