Parsing nested JSON into multiple DataFrames with pandas (Python)

{
  "tableName": "cases",
  "url": "EndpointVoid",
  "tableDataList": [{
    "_id": "100017252700",
    "title": "Test",
    "type": "TECH",
    "created": "2016-09-06T19:00:17.071Z",
    "createdBy": "193164275",
    "lastModified": "2016-10-04T21:50:49.539Z",
    "lastModifiedBy": "1074113719",
    "notes": [{
      "id": "30",
      "title": "Multiple devices",
      "type": "INCCL",
      "origin": "D",
      "componentCode": "PD17A",
      "issueCode": "IP321",
      "affectedProduct": "134322",
      "summary": "testing the json",
      "caller": { "email": "katie.slabiak@spps.org", "phone": "651-744-4522" }
    }, {
      "id": "50",
      "title": "EDU: Multiple Devices - Lightning-to-USB Cable",
      "type": "INCCL",
      "origin": "D",
      "componentCode": "PD17A",
      "issueCode": "IP321",
      "affectedProduct": "134322",
      "summary": "parsing json 2",
      "caller": { "email": "testing1@test.org", "phone": "123-345-1111" }
    }],
    "syncCount": 2316,
    "repair": [
      { "id": "D208491610", "created": "2016-09-06T19:02:48.000Z", "createdBy": "193164275", "lastModified": "2016-09-21T12:49:47.000Z" },
      { "id": "D208491610" },
      { "id": "D208491628", "created": "2016-09-06T19:03:37.000Z", "createdBy": "193164275", "lastModified": "2016-09-21T12:49:47.000Z" }
    ],
    "enterpriseStatus": "8"
  }],
  "dateTime": 1475617849,
  "primaryKeys": ["$._id"],
  "primaryKeyVals": ["100017252700"],
  "operation": "UPDATE"
}

1 Answer

User

#1 · Posted on 2024-06-16 12:16:55

I don't think this is the best approach, but I wanted to show you what is possible.

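import json

import pandas as pd
# pandas >= 1.0; on older releases use: from pandas.io.json import json_normalize
from pandas import json_normalize

# Load the sample JSON from the question into a Python dict
with open('your_sample.json') as f:
    dt = json.load(f)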

Table 1

[The code for Table 1 is not preserved in this copy of the answer.]
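
As a minimal sketch of a case-level table, assuming Table 1 is meant to hold one row per entry of `tableDataList` without the nested `notes` and `repair` lists (this is a guess at the intent, not the answer's original code), something like the following could work. The `dt` dict here is a trimmed-down stand-in for the question's JSON, and `pd.json_normalize` requires pandas 1.0 or later.

```python
import pandas as pd

# Trimmed-down stand-in for the question's JSON (only a few fields kept).
dt = {
    "tableName": "cases",
    "tableDataList": [{
        "_id": "100017252700",
        "title": "Test",
        "type": "TECH",
        "notes": [{"id": "30"}, {"id": "50"}],
        "syncCount": 2316,
        "repair": [{"id": "D208491610"}],
        "enterpriseStatus": "8",
    }],
}

# Assumption: Table 1 = one row per case, with the nested lists dropped
# because they get their own tables (Table 2 and Table 3).
df1 = pd.json_normalize(dt["tableDataList"]).drop(columns=["notes", "repair"])
print(df1)
```

The `_id` column is kept so that the rows of the other two tables can be joined back to their case.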

Table 2

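# One row per note, with the parent case's '_id' attached as metadata
df2 = json_normalize(dt['tableDataList'], 'notes', '_id')
# Older pandas keeps 'caller' as a column of dicts; newer versions flatten it
# into 'caller.phone' and 'caller.email'. Handle both cases.
if 'caller' in df2.columns:
    df2['phone'] = df2['caller'].map(lambda x: x['phone'])
    df2['email'] = df2['caller'].map(lambda x: x['email'])
else:
    df2 = df2.rename(columns={'caller.phone': 'phone', 'caller.email': 'email'})
df2 = df2[['_id', 'id', 'title', 'email', 'phone']]
print(df2)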


            _id  id                                           title  \
0  100017252700  30                                Multiple devices   
1  100017252700  50  EDU: Multiple Devices - Lightning-to-USB Cable   

                    email         phone  
0  katie.slabiak@spps.org  651-744-4522  
1       testing1@test.org  123-345-1111

Table 3

# One row per repair; dropna() discards the entry that carries only an 'id'
df3 = json_normalize(dt['tableDataList'], 'repair', '_id').dropna()
print(df3)


                    created  createdBy          id              lastModified  \
0  2016-09-06T19:02:48.000Z  193164275  D208491610  2016-09-21T12:49:47.000Z   
2  2016-09-06T19:03:37.000Z  193164275  D208491628  2016-09-21T12:49:47.000Z   

            _id  
0  100017252700  
2  100017252700
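
The row index in the Table 3 output jumps from 0 to 2 because `dropna()` removes the middle repair entry, which has only an `id` and no other fields. The sketch below reproduces that effect on a small inline copy of the `repair` data; the variable names `data`, `repairs`, and `complete` are illustrative, and `pd.json_normalize` assumes pandas 1.0 or later.

```python
import pandas as pd

# Inline copy of the relevant part of the question's JSON.
data = [{
    "_id": "100017252700",
    "repair": [
        {"id": "D208491610", "created": "2016-09-06T19:02:48.000Z",
         "createdBy": "193164275", "lastModified": "2016-09-21T12:49:47.000Z"},
        {"id": "D208491610"},  # partial record: every other field is missing
        {"id": "D208491628", "created": "2016-09-06T19:03:37.000Z",
         "createdBy": "193164275", "lastModified": "2016-09-21T12:49:47.000Z"},
    ],
}]

# One row per repair entry, with the parent '_id' repeated on each row.
repairs = pd.json_normalize(data, record_path="repair", meta=["_id"])

# Missing fields become NaN, so dropna() removes the partial record.
complete = repairs.dropna()
print(complete)
```

If a gap-free index is preferred, `complete.reset_index(drop=True)` renumbers the remaining rows from 0.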
