Parsing nested JSON into multiple DataFrames with pandas in Python

Posted 2024-06-16 12:16:55


I have a nested JSON document, shown below, and want to parse it into multiple DataFrames in Python. Please help.

{
"tableName": "cases",
"url": "EndpointVoid",
"tableDataList": [{
    "_id": "100017252700",
    "title": "Test",
    "type": "TECH",
    "created": "2016-09-06T19:00:17.071Z",
    "createdBy": "193164275",
    "lastModified": "2016-10-04T21:50:49.539Z",
    "lastModifiedBy": "1074113719",
    "notes": [{
        "id": "30",
        "title": "Multiple devices",
        "type": "INCCL",
        "origin": "D",
        "componentCode": "PD17A",
        "issueCode": "IP321",
        "affectedProduct": "134322",
        "summary": "testing the json",

        "caller": {
            "email": "katie.slabiak@spps.org",
            "phone": "651-744-4522"
        }
    }, {
        "id": "50",
        "title": "EDU: Multiple Devices - Lightning-to-USB Cable",
        "type": "INCCL",
        "origin": "D",
        "componentCode": "PD17A",
        "issueCode": "IP321",
        "affectedProduct": "134322",
        "summary": "parsing json 2",
        "caller": {
            "email": "testing1@test.org",
            "phone": "123-345-1111"
        }
    }],
    "syncCount": 2316,
    "repair": [{
            "id": "D208491610",
            "created": "2016-09-06T19:02:48.000Z",
            "createdBy": "193164275",
            "lastModified": "2016-09-21T12:49:47.000Z"
        }, {
            "id": "D208491610"
        }, {
            "id": "D208491628",
            "created": "2016-09-06T19:03:37.000Z",
            "createdBy": "193164275",
            "lastModified": "2016-09-21T12:49:47.000Z"
        }

    ],
    "enterpriseStatus": "8"
}],
"dateTime": 1475617849,
"primaryKeys": ["$._id"],
"primaryKeyVals": ["100017252700"],
"operation": "UPDATE"

}

I want to parse this and create 3 tables/DataFrames/CSV files, as shown below. Please help.

Output table in this format


Tags: id, dataframe, title, type, origin, multiple, created, createdby
1 answer
User
#1 · Posted 2024-06-16 12:16:55

I don't think this is the best approach, but I want to show you what is possible.

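import json

import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; the old pandas.io.json import path is deprecated

# Load the sample JSON shown in the question
with open('your_sample.json') as f:
    dt = json.load(f)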

Table 1

[The code for Table 1 was not preserved in the source.]
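One possible way to build Table 1 is sketched here, assuming it should hold the case-level scalar fields and leave the nested `notes` and `repair` lists to Tables 2 and 3. The choice of columns is an assumption, since the post does not show the intended layout, and the inline `dt` is a trimmed stand-in for the loaded file.

```python
import pandas as pd

# Trimmed stand-in for the parsed JSON (assumption: same shape as the sample above).
dt = {
    "tableDataList": [{
        "_id": "100017252700",
        "title": "Test",
        "type": "TECH",
        "created": "2016-09-06T19:00:17.071Z",
        "notes": [{"id": "30"}, {"id": "50"}],
        "repair": [{"id": "D208491610"}],
        "syncCount": 2316,
        "enterpriseStatus": "8",
    }]
}

# Flatten the case records, then drop the list-valued columns
# that Tables 2 and 3 handle separately.
df1 = pd.json_normalize(dt["tableDataList"]).drop(columns=["notes", "repair"])
print(df1)
```

Dropping the two list columns keeps one row per case; `_id` then serves as the key that links this table to the other two.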

Table 2

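# One row per note, carrying the parent case's _id along as metadata
df2 = json_normalize(dt['tableDataList'], 'notes', '_id')
# Recent pandas versions flatten the nested "caller" dict into
# "caller.email" / "caller.phone" columns, so rename them
df2 = df2.rename(columns={'caller.email': 'email', 'caller.phone': 'phone'})
df2 = df2[['_id', 'id', 'title', 'email', 'phone']]
print(df2)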


            _id  id                                           title  \
0  100017252700  30                                Multiple devices   
1  100017252700  50  EDU: Multiple Devices - Lightning-to-USB Cable   

                    email         phone  
0  katie.slabiak@spps.org  651-744-4522  
1       testing1@test.org  123-345-1111  

Table 3

# One row per repair entry; dropna() discards entries with missing fields
df3 = json_normalize(dt['tableDataList'], 'repair', '_id').dropna()
print(df3)


                    created  createdBy          id              lastModified  \
0  2016-09-06T19:02:48.000Z  193164275  D208491610  2016-09-21T12:49:47.000Z   
2  2016-09-06T19:03:37.000Z  193164275  D208491628  2016-09-21T12:49:47.000Z   

            _id  
0  100017252700  
2  100017252700  
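Since the question also asks for CSV output, here is a minimal sketch of writing a frame out with `DataFrame.to_csv`. The inline data is a trimmed stand-in for the sample, and the file name `repairs.csv` is a placeholder, not something from the original post. Note that `dropna()` removes the repair entry that has only an `id`; leave it out if incomplete rows should be kept.

```python
import io

import pandas as pd

# Trimmed stand-in for one case from the sample (assumption: same shape).
case = {
    "_id": "100017252700",
    "repair": [
        {"id": "D208491610", "created": "2016-09-06T19:02:48.000Z"},
        {"id": "D208491610"},  # incomplete entry, as in the sample
    ],
}

# One row per repair entry, with the parent _id repeated on each row.
df3 = pd.json_normalize([case], record_path="repair", meta="_id")
complete = df3.dropna()  # keeps only rows where every field is present

# Write to an in-memory buffer here; pass a path instead to write a file.
buf = io.StringIO()
complete.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
# complete.to_csv("repairs.csv", index=False)  # hypothetical file name
```

The same `to_csv` call applies to each of the three frames, giving one CSV file per table.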
