How to parse multiple index values and create a CSV file when parsing JSON data in Python

Posted 2024-04-25 14:48:52

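I have a few static key columns, EmployeeId and type, plus several columns that come from the first for loop.

In the second for loop, values should be appended to the existing DataFrame columns only when a specific key is present; otherwise the columns obtained from the first for loop should stay as they are.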

Output of the first for loop:

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","","",""

After the second for loop, I get the following output:

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","AMAZON","1",""
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","FLIPKART","2",""

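With my code, I get the 2 records above when the Employee tag is present. However, some of my JSON files have no Employee tag, and for those the output should be the same as the output of the first loop.

Instead, my code produces 0 records for those files. If my approach is wrong, please help me correct it.

Apologies if the question is unclear; I am new to Python. The code is below: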

    for i in range(len(json_file['enty'])):
        temp = {}
        temp['EmployeeId'] = json_file['enty'][i]['id']
        temp['type'] = json_file['enty'][i]['type']
        for key in json_file['enty'][i]['data']['attributes'].keys():        
            try:
                temp[key] = json_file['enty'][i]['data']['attributes'][key]['values'][0]['value']
            except:
                temp[key] = None      

        for key in json_file['enty'][i]['data']['attributes'].keys(): 
            if(key == 'Employee'):
                for j in range(len(json_file['enty'][i]['data']['attributes']['Employee']['group'])):
                    for key in json_file['enty'][i]['data']['attributes']['Employee']['group'][j].keys():
                        try:
                            temp[key] = json_file['enty'][i]['data']['attributes']['Employee']['group'][j][key]['values'][0]['value']
                        except:
                            temp[key] = None

                    temp_df = pd.DataFrame([temp])
                    df = pd.concat([df, temp_df], sort=True)

    # Rearranging columns
    df = df[['EmployeeId', 'type'] + [col for col in df.columns if col not in ['EmployeeId', 'type']]]

    # Writing the dataset
    df[columns_list].to_csv("Test22.csv", index=False, quotechar='"', quoting=1)

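If the Employee tag is not available, I get 0 records as output, but I want 1 record for each output of the first for loop. In other words: if the Employee tag has 2 entries, there should be 2 records, each carrying the static columns "EmployeeId", "type", "KeyColumn", "Start", "End"; if the Employee tag is not available, there should be one record with those static columns filled in and the remaining columns blank.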

1 answer
User
#1 · Posted 2024-04-25 14:48:52

A longer solution that modifies your code: it adds a loop, changes the indexing, and changes the argument passed to range:

    import pandas as pd

    df = pd.DataFrame()
    record = json_file['data'][0]

    # One output row per element of the longest list under 'data1'.
    num = max(len(v) for v in record['data1'].values())
    for i in range(num):
        temp = {'Empid': record['Empid'], 'Empname': record['Empname']}
        for key, items in record['data1'].items():
            try:
                # Collect every relative id stored under this key.
                temp[key] = [item['relative']['id'] for item in items]
            except (KeyError, TypeError):
                temp[key] = None
        temp_df = pd.DataFrame([temp])
        df = pd.concat([df, temp_df], ignore_index=True)

    # Flatten the list-valued columns; dropping duplicates leaves one value per row.
    for key in record['data1']:
        df[key] = pd.Series([x for y in df[key].tolist() for x in y]).drop_duplicates()

Now:

    print(df)

prints:

      Empid Empname    XXXX   YYYYY
    0  1234     ABC  Naveen   Kumar
    1  1234     ABC     NaN  Rajesh
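
For the JSON layout used in the question itself (`json_file['enty'][i]['data']['attributes']`), the zero-record result most likely comes from the fact that a row is only appended inside the `if key == 'Employee'` branch, so an entity without that key never reaches `pd.concat`. Below is a minimal sketch of one way around it, not a definitive implementation. It assumes the structure implied by the question's indexing; the helper names (`first_value`, `build_rows`, `rows_to_csv`, `value`), the sample data, and the second entity `Emp2` are hypothetical. It also swaps pandas for the standard-library `csv` module so that it runs without extra dependencies.

```python
import csv
import io

# Column order taken from the CSV header shown in the question.
COLUMNS = ["EmployeeId", "type", "KeyColumn", "Start", "End",
           "Country", "Target", "CountryId", "TargetId"]


def first_value(attr):
    """Return attr['values'][0]['value'], or None if the shape differs."""
    try:
        return attr["values"][0]["value"]
    except (KeyError, IndexError, TypeError):
        return None


def build_rows(json_file):
    """Build one row per Employee group entry, or one base row if there is none."""
    rows = []
    for entity in json_file["enty"]:
        attributes = entity["data"]["attributes"]
        # Static columns plus the flat attributes (the "first loop").
        base = {"EmployeeId": entity["id"], "type": entity["type"]}
        for key, attr in attributes.items():
            if key != "Employee":
                base[key] = first_value(attr)
        groups = attributes.get("Employee", {}).get("group", [])
        if not groups:
            # No Employee tag: keep the base row instead of dropping the entity.
            rows.append(base)
            continue
        for group in groups:
            row = dict(base)  # copy, so values from one group never leak into the next
            for key, attr in group.items():
                row[key] = first_value(attr)
            rows.append(row)
    return rows


def rows_to_csv(rows, columns=COLUMNS):
    """Serialize rows with every field quoted; missing columns become empty strings."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="",
                            extrasaction="ignore", quoting=csv.QUOTE_ALL,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def value(v):
    """Wrap v in the assumed {'values': [{'value': v}]} shape (hypothetical helper)."""
    return {"values": [{"value": v}]}


# Hypothetical sample: Emp1 has an Employee group with two entries, Emp2 has none.
sample = {"enty": [
    {"id": "Emp1", "type": "Metal", "data": {"attributes": {
        "KeyColumn": value("1212121212"),
        "Start": value("2000-06-17"),
        "End": value("9999-12-31"),
        "Employee": {"group": [
            {"Target": value("AMAZON"), "CountryId": value("1")},
            {"Target": value("FLIPKART"), "CountryId": value("2")},
        ]},
    }}},
    {"id": "Emp2", "type": "Metal", "data": {"attributes": {
        "KeyColumn": value("3434343434"),
        "Start": value("2001-01-01"),
        "End": value("9999-12-31"),
    }}},
]}

if __name__ == "__main__":
    print(rows_to_csv(build_rows(sample)))
```

If you prefer to stay with pandas, the same list of dicts can be passed to `pd.DataFrame(rows, columns=COLUMNS)` and written with `to_csv(..., index=False, quoting=csv.QUOTE_ALL)`; the `quoting=1` in the question is the numeric value of `csv.QUOTE_ALL`.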