如何转换嵌套字典?

2024-04-25 17:33:28 发布

您现在位置:Python中文网/ 问答频道 /正文

我想把API调用的结果转换成数据帧。 API调用的结果是一个嵌套字典,但是生成的数据帧不是我所需要的。你知道吗

除了规范化,我还尝试了pd.DataFrame.from\ U目录然而,直到现在都没有成功。我也试着把字典压平,但什么也没有。你知道吗

我使用了以下调用:

[73] results = requests.get(url).json()
results

结果是:

{'result': {'totalrows': 3124,
  'rows': [{'rownum': 1,
    'values': [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'},
     {'field': 'issueid', 'value': 472683},
     {'field': 'ticker', 'value': 'AAPL'},
     {'field': 'companyname', 'value': 'APPLE INC'},
     {'field': 'issuetitle', 'value': 'COM'},
     {'field': 'filerid', 'value': 1089387}]},
   {'rownum': 2,
    'values': [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'},
     {'field': 'issueid', 'value': 472683},
     {'field': 'ticker', 'value': 'AAPL'},
     {'field': 'companyname', 'value': 'APPLE INC'},
     {'field': 'issuetitle', 'value': 'COM'},
     {'field': 'filerid', 'value': 1086893}]},
   {'rownum': 3,
    'values': [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'},
     {'field': 'issueid', 'value': 472683},
     {'field': 'ticker', 'value': 'AAPL'},
     {'field': 'companyname', 'value': 'APPLE INC'},
     {'field': 'issuetitle', 'value': 'COM'},
     {'field': 'filerid', 'value': 1085803}]}

然后,为了生成数据帧,我使用了以下代码:


[74] Owners = results['result']['rows']
df1 = json_normalize(Owners)
df1.head()

这是输出:

  rownum    values
0   1      [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'}, 
           {'field': 'issueid', 'value': 472683}, {'field': 
           'ticker', 'value': 'AAPL'}, {'field': 'companyname', 
           'value': 'APPLE INC'}, {'field': 'issuetitle', 'value': 
           'COM'}, {'field': 'filerid', 'value': 1089387} 

1   2      [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'}, 
           {'field': 'issueid', 'value': 472683}, {'field': 
           'ticker', 'value': 'AAPL'}, {'field': 'companyname', 
           'value': 'APPLE INC'}, {'field': 'issuetitle', 'value': 
           'COM'}, {'field': 'filerid', 'value': 1086893}

2   3      [{'field': 'querydate', 'value': '7/31/2019 3:19 PM'}, {'field': 
           'issueid', 'value': 472683}, {'field': 'ticker', 'value': 'AAPL'}, 
           {'field': 'companyname', 'value': 'APPLE INC'}, {'field': 
           'issuetitle', 'value': 'COM'}, {'field': 'filerid', 'value': 1085803}

但是,我希望获得以下格式的数据帧:

Desired DataFrame format


Tags: 数据comfieldapplevalueinctickeraapl
2条回答

可以使用^{},但需要删除数据中所有不必要的数据。实际上,您只想保留每行的field值和value。您可以使用列表理解:

data = [{ field["field"]:field["value"] for field in row['values']
                    } for row in data['result']["rows"]]
print(data)
# [{'querydate': '7/31/2019 3:19 PM', 
#     'issueid': 472683, 
#     'ticker': 'AAPL', 
#     'companyname': 'APPLE INC',
#     'issuetitle': 'COM',
#     'filerid': 1089387},
# {
#     'querydate': '7/31/2019 3:19 PM',
#     'issueid': 472683,
#     'ticker': 'AAPL',
#     'companyname': 'APPLE INC',
#     'issuetitle': 'COM',
#     'filerid': 1086893},
# {
#     'querydate': '7/31/2019 3:19 PM', 
#     'issueid': 472683, 
#     'ticker': 'AAPL', 
#     'companyname': 'APPLE INC', 
#     'issuetitle': 'COM', 
#     'filerid': 1085803
# }]

一旦有了这个字典,就可以调用from_dict方法:

df = pd.DataFrame.from_dict(data)
print(df)
#   companyname  filerid  issueid issuetitle          querydate ticker
# 0   APPLE INC  1089387   472683        COM  7/31/2019 3:19 PM   AAPL
# 1   APPLE INC  1086893   472683        COM  7/31/2019 3:19 PM   AAPL
# 2   APPLE INC  1085803   472683        COM  7/31/2019 3:19 PM   AAPL

如果要将rownum作为列(或索引)获取:

data = [{**{field["field"]:field["value"] for field in row['values']}, **{'rownum': row["rownum"]}} for row in data['result']["rows"]]

df = pd.DataFrame.from_dict(data)
print(df)
#   companyname  filerid  issueid issuetitle          querydate  rownum ticker
# 0   APPLE INC  1089387   472683        COM  7/31/2019 3:19 PM       1   AAPL
# 1   APPLE INC  1086893   472683        COM  7/31/2019 3:19 PM       2   AAPL
# 2   APPLE INC  1085803   472683        COM  7/31/2019 3:19 PM       3   AAPL

朴素的嵌套for循环尝试。。。你知道吗

import pandas as pd

df = pd.DataFrame([])

for row in json["result"]["rows"]:
    rownum = row["rownum"]
    querydate = issueid = ticker = companyname = issuetitle = filerid = None
    for value_dict in row["values"]:
        if value_dict["field"] == "querydate":
            querydate = value_dict["value"]
        elif value_dict["field"] == "issueid":
            issueid = value_dict["value"]
        elif value_dict["field"] == "ticker":
            ticker = value_dict["value"]
        elif value_dict["field"] == "companyname":
            companyname = value_dict["value"]
        elif value_dict["field"] == "filerid":
            filerid = value_dict["value"]
    df = df.append(pd.DataFrame({"rownum": rownum,
                                 "querydate": querydate,
                                 "issueid": issueid,
                                 "ticker": ticker,
                                 "companyname": companyname,
                                 "issuetitle": issuetitle,
                                 "filerid": filerid,
                                }, index=[0]), ignore_index=True)

print(df)

JSON对象:

json = {
    "result": {
        "totalrows": 3,
        "rows": [
            {
                "rownum": 1,
                "values": [
                    {
                        "field": "querydate",
                        "value": "7/31/2019 3:19 PM"
                    },
                    {
                        "field": "issueid",
                        "value": 472683
                    },
                    {
                        "field": "ticker",
                        "value": "AAPL"
                    },
                    {
                        "field": "companyname",
                        "value": "APPLE INC"
                    },
                    {
                        "field": "issuetitle",
                        "value": "COM"
                    },
                    {
                        "field": "filerid",
                        "value": 1089387
                    }
                ]
            },
            {
                "rownum": 2,
                "values": [
                    {
                        "field": "querydate",
                        "value": "7/31/2019 3:19 PM"
                    },
                    {
                        "field": "issueid",
                        "value": 472683
                    },
                    {
                        "field": "ticker",
                        "value": "AAPL"
                    },
                    {
                        "field": "companyname",
                        "value": "APPLE INC"
                    },
                    {
                        "field": "issuetitle",
                        "value": "COM"
                    },
                    {
                        "field": "filerid",
                        "value": 1086893
                    }
                ]
            },
            {
                "rownum": 3,
                "values": [
                    {
                        "field": "querydate",
                        "value": "7/31/2019 3:19 PM"
                    },
                    {
                        "field": "issueid",
                        "value": 472683
                    },
                    {
                        "field": "ticker",
                        "value": "AAPL"
                    },
                    {
                        "field": "companyname",
                        "value": "APPLE INC"
                    },
                    {
                        "field": "issuetitle",
                        "value": "COM"
                    },
                    {
                        "field": "filerid",
                        "value": 1085803
                    }
                ]
            }
        ]
    }
}

输出:

   rownum          querydate  issueid ticker companyname issuetitle  filerid
0       1  7/31/2019 3:19 PM   472683   AAPL   APPLE INC        COM  1089387
1       2  7/31/2019 3:19 PM   472683   AAPL   APPLE INC        COM  1086893
2       3  7/31/2019 3:19 PM   472683   AAPL   APPLE INC        COM  1085803

相关问题 更多 >