Creating nested JSON data for D3 with Python

2 votes
1 answer
1933 views
Asked 2025-04-18 00:47

I'm trying to write a Python function that formats data as a JSON string for D3 to consume.

I need it in this format:

{
 "name": "flare",
 "children": [
  {
   "name": "analytics",
   "children": [
    {
     "name": "cluster",
     "children": [
      {"name": "AgglomerativeCluster", "size": 3938},
      {"name": "CommunityStructure", "size": 3812},
      {"name": "HierarchicalCluster", "size": 6714},
      {"name": "MergeEdge", "size": 743}
     ]
    },

For reference, see http://bl.ocks.org/mbostock/4063550; it applies to this kind of tree: http://johan.github.io/d3/ex/tree.html

The data structure I've come up with so far is:

{'nlp':{'course':['course','range','topics','language','processing','word']}}

But I need the output to be:

{
   "name":"Natural Language Processing",
   "children":[
      {
         "name":"course",
         "children":[
            {
               "name":"course",
               "size":700
            },
            {
               "name":"range",
               "size":700
            },
            {
               "name":"topics",
               "size":700
            },
            {
               "name":"language",
               "size":700
            },
            {
               "name":"processing",
               "size":700
            },
            {
               "name":"word",
               "size":700
            }
         ]
      }
   ]
}

I started with this approach:

def format_d3_circle(data_input):
    d3_data = {};
    #root level
    d3_data['name'] = data_input[data_input.keys()[0]].keys()[0]
    sub_levels = data_input[data_input.keys()[0]]
    for level_one_key, level_one_data in sub_levels:
        d3_data['children'] = []
    return json.dumps(d3_data)

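But I don't think I'm approaching this correctly, and I'm having trouble picturing a good way to build the JSON nodes.

Any suggestions on how to abstract this problem and build the nested JSON structure I need from dict/list/JSON input?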

1 Answer

2

Here is a solution I've been working on. It takes tabular data and works for any number of levels.

import pandas as pd
import json

def find_element(children_list,name):
    """
    Find element in children list
    if exists or return none
    """
    for i in children_list:
        if i["name"] == name:
            return i
    #If not found return None
    return None

def add_node(path,value,nest):
    """
    The path is a list.  Each element is a name that corresponds 
    to a level in the final nested dictionary.  
    """

    #Get first name from path
    this_name = path.pop(0)

    #Does the element exist already?
    element = find_element(nest["children"], this_name)

    #If the element exists, we can use it, otherwise we need to create a new one
    if element:

        if len(path)>0:
            add_node(path,value, element)

    #Else it does not exist so create it and return its children
    else:

        if len(path) == 0:
            nest["children"].append({"name": this_name, "value": value})
        else:
            #Add new element
            nest["children"].append({"name": this_name, "children":[]})

            #Get added element 
            element = nest["children"][-1]

            #Still elements of path left so recurse
            add_node(path,value, element)
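
The same insertion can also be written iteratively. The following is only a sketch, not part of the original solution: `insert_path` is a hypothetical name, it assumes a non-empty path, and unlike `add_node` it leaves the caller's `path` list untouched and overwrites the value if the leaf already exists.

```python
import json

def insert_path(root, path, value):
    """Walk `path` from `root`, creating missing nodes along the way,
    then store `value` on the final node. `path` is not modified."""
    node = root
    for name in path:
        children = node.setdefault("children", [])
        # Reuse an existing child with this name, if there is one
        child = next((c for c in children if c["name"] == name), None)
        if child is None:
            child = {"name": name}
            children.append(child)
        node = child
    node["value"] = value
    return root

tree = {"name": "root", "children": []}
rows = [
    (["a", "a1", "a11"], 1),
    (["a", "a1", "a12"], 2),
    (["b", "b1", "b11"], 5),
]
for path, value in rows:
    insert_path(tree, path, value)

print(json.dumps(tree, indent=2))
```

Because the path is only read, the same list can be reused after the call, which is not the case with `path.pop(0)`.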

Below is an example of using this solution. You need to tell it which columns to use as levels and which column holds the values.

# Wrap the literal JSON in a file-like object; newer pandas versions
# deprecate passing a raw JSON string to read_json.
from io import StringIO

df = pd.read_json(StringIO('{"l1":{"0":"a","1":"a","2":"a","3":"a","4":"b","5":"b","6":"b","7":"b"},"l2":{"0":"a1","1":"a1","2":"a2","3":"a2","4":"b1","5":"b1","6":"b2","7":"b3"},"l3":{"0":"a11","1":"a12","2":"a21","3":"a22","4":"b11","5":"b12","6":"b22","7":"b34"},"val":{"0":1,"1":2,"2":3,"3":4,"4":5,"5":6,"6":7,"7":8}}'))


d = {"name": "root",
"children": []}

levels = ["l1","l2", "l3"]
for row in df.iterrows():
    r = row[1]
    path = list(r[levels])
    value = r["val"]
    add_node(path,value,d)

print(json.dumps(d, sort_keys=False,
                 indent=2))
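
For the dictionary shown in the question, pandas may not be needed at all. Here is a minimal recursive sketch under a few assumptions: `dict_to_d3` and the `labels` mapping are hypothetical names, the fixed `size` of 700 is simply copied from the desired output in the question, and the question does not say how 'nlp' becomes "Natural Language Processing", so that lookup is a guess.

```python
import json

def dict_to_d3(name, node, size=700):
    """Convert nested dicts (with lists of leaf names at the bottom)
    into D3's {"name": ..., "children": [...]} layout."""
    if isinstance(node, dict):
        # Each key becomes an inner node; recurse into its value
        children = [dict_to_d3(key, sub, size) for key, sub in node.items()]
    else:
        # A list of leaf names: each becomes a {"name", "size"} leaf
        children = [{"name": leaf, "size": size} for leaf in node]
    return {"name": name, "children": children}

data = {'nlp': {'course': ['course', 'range', 'topics',
                           'language', 'processing', 'word']}}

# Hypothetical mapping from short keys to display names
labels = {'nlp': 'Natural Language Processing'}

# The input has a single top-level key, which becomes the root node
root_key, subtree = next(iter(data.items()))
result = dict_to_d3(labels.get(root_key, root_key), subtree)

print(json.dumps(result, indent=3))
```

If real sizes are available, the leaf list could hold (name, size) pairs instead of bare names, and the leaf branch would unpack them.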
