Loading JSON Lines into a DataFrame in Python

Posted 2024-05-12 17:33:22


Please work your magic.

I have a JSON file in this format:

{"_id":{"$oid":"5a8fda432467a7bb10swec"},"code":4321,"phone":{"$numberLong":"32323232"},"name":"Batako","fax":{"$numberLong":"12345678"}}
{"_id":{"$oid":"7dhds9ds9dsa9dsa9sdsds"},"code":3212,"phone":"","name":"Franco","fax":0}
{"_id":{"$oid":"6dhds9dadssa9dsa9sdsds"},"code":5612,"phone":"6483737","name":"Brescia","fax":"123-232-1331"}
{"_id":{"$oid":"8dshds9ds9dsa9dsa9sdsds"},"code":4312,"phone":{"$numberLong":"9453737"},"name":"Kalon","fax":{"$numberLong":"65543434"}}

How can I create a DataFrame from it using pandas?

This is what I have been trying:

import pandas as pd
import json
data = []
for line in open(r'file.json', 'r', encoding='utf-8'):
    data.append(json.loads(line))
    
df = pd.json_normalize(data)
df.head()

But I get this error:

JSONDecodeError: Expecting value: line 1 column 133 (char 132)

1 Answer
User
#1 · Posted 2024-05-12 17:33:22

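If every line of the file contains a JSON string, and some of the values are dictionaries holding a single value, you can try the following example to load it into a DataFrame: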

import pandas as pd

# lines=True tells pandas to parse one JSON object per line
df = pd.read_json('<your file>', lines=True)

def unpack(x):
    rv = []
    for v in x:
        if isinstance(v, dict):
            rv.append([*v.values()][0])
        else:
            rv.append(v)
    return rv

df = df.apply(unpack)
print(df)

Prints:

                      _id  code     phone    name       fax
0  5a8fda432467a7bb10swec  4321  32323232  Batako  12345678
1  7dhds9ds9dsa9dsa9sdsds  3212   9283737  Franco  65543434
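
A variation on the same idea, as a minimal sketch: it unwraps the single-key dictionaries on the parsed records before the DataFrame is built, so nothing depends on how `df.apply` treats list return values. The names `unwrap`, `flatten_record` and `SAMPLE_LINES` are made up for this illustration, and the two sample lines are copied from the question and held in memory instead of being read from a file.

```python
import json

import pandas as pd

# Two lines copied from the question, kept in memory (assumption: the real
# data would come from iterating over the file instead).
SAMPLE_LINES = [
    '{"_id":{"$oid":"5a8fda432467a7bb10swec"},"code":4321,"phone":{"$numberLong":"32323232"},"name":"Batako","fax":{"$numberLong":"12345678"}}',
    '{"_id":{"$oid":"7dhds9ds9dsa9dsa9sdsds"},"code":3212,"phone":"","name":"Franco","fax":0}',
]


def unwrap(value):
    """Return the inner value of a one-key dict; pass anything else through."""
    if isinstance(value, dict) and len(value) == 1:
        return next(iter(value.values()))
    return value


def flatten_record(record):
    """Unwrap every top-level value of one parsed JSON object."""
    return {key: unwrap(val) for key, val in record.items()}


records = [flatten_record(json.loads(line)) for line in SAMPLE_LINES]
flat_df = pd.DataFrame(records)
print(flat_df)
```

Only top-level values are unwrapped here; deeper nesting would need a recursive helper or `pd.json_normalize`.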

EDIT: To skip the lines on which json raises an error, you can use this example:

import json
import pandas as pd

data = []
with open('a1.txt', 'r') as f_in:
    for line in f_in:
        try:
            data.append(json.loads(line))
        except json.JSONDecodeError:
            continue    # skip lines that are not valid JSON

df = pd.DataFrame(data)

def unpack(x):
    rv = []
    for v in x:
        if isinstance(v, dict):
            rv.append([*v.values()][0])
        else:
            rv.append(v)
    return rv

df = df.apply(unpack)
print(df)

Prints:

                       _id  code     phone     name           fax
0   5a8fda432467a7bb10swec  4321  32323232   Batako      12345678
1   7dhds9ds9dsa9dsa9sdsds  3212             Franco             0
2   6dhds9dadssa9dsa9sdsds  5612   6483737  Brescia  123-232-1331
3  8dshds9ds9dsa9dsa9sdsds  4312   9453737    Kalon      65543434
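
Skipping bad lines silently makes it hard to see what was wrong with them. As a rough sketch, the loop can also record where each failure happened, using the `pos` attribute of `json.JSONDecodeError` (the same character offset that appears in the error message from the question). The function name `parse_json_lines` and the sample strings are hypothetical, chosen only for this example.

```python
import json


def parse_json_lines(lines):
    """Parse an iterable of JSON strings and return (records, errors)."""
    records, errors = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped, not reported
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # exc.pos is the character offset at which parsing failed;
            # keep a little surrounding text to make the problem visible
            context = line[max(0, exc.pos - 20):exc.pos + 20]
            errors.append((lineno, exc.pos, context))
    return records, errors


# Hypothetical input: the second entry is deliberately malformed.
sample = [
    '{"code": 4321, "name": "Batako"}',
    '{"code": 3212, "name": }',
    '',
    '{"code": 5612, "name": "Brescia"}',
]
records, errors = parse_json_lines(sample)
for lineno, pos, context in errors:
    print(f"line {lineno}, char {pos}: ...{context}...")
```

With a real file, an open file object can be passed as `lines`, and `pd.DataFrame(records)` then builds the frame from the rows that parsed.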
