Recursively parse/convert structured text into a dictionary

xyz1                      : 14
xyz2                      : 35
xyz3                      : 14
xyz4
  sub1_xyz4
    sub1_sub1_xyz4        : 45
    sub2_sub1_xyz4        : b1fawe
  sub2 xyz4               : 455
xyz5                      : 2424

def parse_output(value, indent=0):
    parsed_dict = dict()
    if indent > 0:
        for i in re.split('\n(?!\s{,%d})' % (indent - 1), value):
            print("split value is: : ", i)
            if '\n' not in i:
                iter_val = iter(list(map(lambda x: x.strip(), re.split(' : ', i))))
                parsed_dict = {**parsed_dict, **dict(zip(iter_val, iter_val))}
            else:
                parse_bearer_info(re.split('\n', i, 1)[1])
                iter_val = iter(list(map(lambda x: x.strip(), re.split('\n', i, 1))))
                parsed_dict = {**parsed_dict, **dict(zip(iter_val, iter_val))}
    else:
        for i in re.split('\n(?!\s+)', value):
            #print("iteration value is: ", i)
            if '\n' not in i:
                iter_val = iter(list(map(lambda x: x.strip(), re.split(' : ', i))))
                parsed_dict = {**parsed_dict, **dict(zip(iter_val, iter_val))}
            else:
                #print(re.split('\n', i, 1))
                #out = parse_bearer_info(re.split('\n', i, 1)[1], 4)
                iter_val = iter(list(map(lambda x: x.strip(), re.split('\n', i, 1))))
                parsed_dict = {**parsed_dict, **dict(zip(iter_val, iter_val))}
    return parsed_dict

2 Answers

User

Answer #1 · edited 2024-04-26 04:34:40

You can use itertools.groupby together with recursion:

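import itertools, re, json

# Split each non-empty line into [key] or [key, value] around the " : " separator.
_data = [re.split(r'\s+:\s+', i) for i in filter(None, content.split('\n'))]

def group_data(d):
  # Group consecutive rows: True for unindented "key : value" rows,
  # False for a header row together with its indented children.
  _d = [[a, list(b)] for a, b in itertools.groupby(d, key=lambda x:bool(x[-1]) and not x[0].startswith(' '))]
  _new_result = {}
  for a, b in _d:
    if a:
      _new_result.update(dict([[c, _d] for c, [_d] in b]))
    else:
      # The first row is the header; drop two spaces of indentation from the rest and recurse.
      _new_result[b[0][0]] = group_data([[c[2:], _d] for c, _d in b[1:]])
  return _new_result

# Each row is passed in as [key, [value]], or [key, []] for a header row.
print(json.dumps(group_data([[a, b] for a, *b in _data]), indent=4))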

Output:

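{
    "xyz1": "14",
    "xyz2": "35",
    "xyz3": "14",
    "xyz4": {
        "sub1_xyz4": {
            "sub1_sub1_xyz4": "45",
            "sub2_sub1_xyz4": "b1fawe"
        },
        "sub2 xyz4": "455"
    },
    "xyz5": "2424"
}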

where content is:

xyz1                      : 14
xyz2                      : 35
xyz3                      : 14
xyz4
  sub1_xyz4
    sub1_sub1_xyz4        : 45
    sub2_sub1_xyz4        : b1fawe
  sub2 xyz4               : 455
xyz5                      : 2424
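
For comparison, here is a minimal sketch of a plain recursive parser with no groupby. It is not taken from the answer above: the names parse_block, parse_text and sample are hypothetical, and it assumes that children are indented more deeply than their header and that keys and values are separated by " : " as in the sample. Unlike the groupby grouping, it keeps two consecutive header blocks at the same level apart.

```python
import re

def parse_block(lines, start=0, indent=0):
    """Parse lines[start:] belonging to one indentation level.

    Returns (parsed_dict, index_of_first_unconsumed_line).
    """
    result = {}
    i = start
    while i < len(lines):
        line = lines[i]
        cur = len(line) - len(line.lstrip())
        if cur < indent:                      # dedent: this block is finished
            break
        parts = re.split(r'\s+:\s+', line.strip(), maxsplit=1)
        if len(parts) == 2:                   # "key : value" row
            result[parts[0]] = parts[1]
            i += 1
            continue
        # Header row: its children are the following, more deeply indented lines.
        if i + 1 < len(lines):
            nxt = lines[i + 1]
            child_indent = len(nxt) - len(nxt.lstrip())
        else:
            child_indent = cur
        if child_indent > cur:
            result[parts[0]], i = parse_block(lines, i + 1, child_indent)
        else:                                 # header without children
            result[parts[0]] = {}
            i += 1
    return result, i

def parse_text(text):
    # Blank lines are skipped before parsing.
    lines = [ln for ln in text.split('\n') if ln.strip()]
    return parse_block(lines)[0]

sample = """
xyz1                      : 14
xyz2                      : 35
xyz3                      : 14
xyz4
  sub1_xyz4
    sub1_sub1_xyz4        : 45
    sub2_sub1_xyz4        : b1fawe
  sub2 xyz4               : 455
xyz5                      : 2424
"""

parsed = parse_text(sample)
```

Here parsed["xyz4"]["sub1_xyz4"]["sub1_sub1_xyz4"] is "45", and the key "sub2 xyz4" keeps its inner space because only the text around the separator is stripped.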

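User

Answer #2 · edited 2024-04-26 04:34:40

You could probably do this recursively, but since you only need to keep track of the current indentation level, it is enough to keep a stack of the dictionaries currently being filled. Add each key to the last item on the stack. When a line has no value, add a new dictionary under that key and push it onto the stack. When the indentation decreases, pop from the stack.

For example: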

# s is the input text (the same string as content in the first answer)
res = {}
stack = [res]
cur_indent = 0
for line in s.split('\n'):
    indent = len(line) - len(line.lstrip())
    if indent < cur_indent:                 # backing out (pops one level per dedent)
        stack.pop()
    cur_indent = indent

    # Note: this removes every space, including spaces inside a key.
    vals = line.replace(" ", "").split(':')

    current_dict = stack[-1]
    if len(vals) == 2:                      # "key : value" line
        current_dict[vals[0]] = vals[1]
    else:                                   # no value, must be a new level
        current_dict[vals[0]] = {}
        stack.append(current_dict[vals[0]])

Result:

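{'xyz1': '14', 'xyz2': '35', 'xyz3': '14', 'xyz4': {'sub1_xyz4': {'sub1_sub1_xyz4': '45', 'sub2_sub1_xyz4': 'b1fawe'}, 'sub2xyz4': '455'}, 'xyz5': '2424'}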
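
The stack idea can also be sketched so that it copes with a dedent of several levels at once and keeps spaces inside keys. This is a variant under stated assumptions, not the answerer's code: the function name parse_indented is hypothetical, each stack entry stores the indentation of the header that opened it, and the first ":" on a line is taken as the key/value separator.

```python
def parse_indented(text):
    """Stack-based parser for indented "key : value" text (illustrative sketch)."""
    root = {}
    # Each entry is (indent of the header that opened the dict, dict).
    # The root uses -1 so that it is never popped.
    stack = [(-1, root)]
    for line in text.split('\n'):
        if not line.strip():                  # skip blank lines
            continue
        indent = len(line) - len(line.lstrip())
        # Pop every level whose header is indented at least as far as this line.
        while indent <= stack[-1][0]:
            stack.pop()
        key, sep, value = line.strip().partition(':')
        key = key.strip()
        current = stack[-1][1]
        if sep:                               # "key : value" line
            current[key] = value.strip()
        else:                                 # header line: open a new level
            current[key] = {}
            stack.append((indent, current[key]))
    return root

text = "xyz4\n  sub1_xyz4\n    sub1_sub1_xyz4 : 45\nxyz5 : 2424"
print(parse_indented(text))
# {'xyz4': {'sub1_xyz4': {'sub1_sub1_xyz4': '45'}}, 'xyz5': '2424'}
```

In this input the last line drops from indentation 4 straight to 0, so two levels are popped before xyz5 is stored at the top level.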
