如何处理嵌套列表？

3 投票

3 回答

1760 浏览

提问于 2025-04-15 23:36

假设我有一个这样的项目符号列表：

* list item 1
* list item 2 (a parent)
** list item 3 (a child of list item 2)
** list item 4 (a child of list item 2 as well)
*** list item 5 (a child of list item 4 and a grand-child of list item 2)
* list item 6

我想把它解析成一个嵌套列表或者其他数据结构，这样可以清楚地表示元素之间的父子关系（而不是依赖它们的内容和相对位置）。比如，这里有一个包含项目和它的子项列表的元组列表（依此类推）：

编辑：希望能给出一个更准确的列表示例，其中列表中的每个元素都是一个元组，包含：一个项目符号的文本，以及如果有的话，一个子项列表（以相同的形式）。

    [('list item 1',),
     ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]
     ('list item 6',)]

[('list item 1',),
 ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]),
 ('list item 6',)]

我尝试用普通的Python和一些Pyparsing的实验来实现这个，但进展不大。我现在有两个主要问题：

我需要采用什么策略来让这个工作？我知道递归是解决方案的一部分，但我很难把它和斐波那契数列联系起来。
我相信我不是第一个遇到这个问题的人，但我不知道这个问题的术语，无法有效搜索更多相关信息。有哪些相关的问题可以让我更好地了解如何解决这类问题？

数据结构元组数据处理递归解析嵌套列表父子关系项目符号列表

3 个回答

记录你正在解析的当前“深度”。

如果下一行的深度比当前深度大，就用新的深度递归调用解析器，然后把这个调用的结果加到当前列表里。
如果下一行的深度和当前深度相等，就把它加到当前列表里。
如果下一行的深度比当前深度小，就返回当前列表。

回答于 2025-04-15 由 Python大师

分享举报

从搜索算法的角度来看，你提供的这个序列实际上是通过深度优先搜索（DFS）生成的。所以我的策略就是用这个深度优先搜索的序列来重建树的结构。

下面是用Python写的代码：

from collections import deque
def dfsBullet(bullet,depth):
    """
       parse the subtree with depth and the startnode of bullet[0]
    """
    li = []
    if depth != 0:
            item = bullet.popleft()
            li.append(item.split(' ',1)[1])
    while (len(bullet) != 0):
            item = bullet[0]
            #apply same algo to the child node
            if len(item.split(' ',1)[0]) > depth:
                    sublist = dfsBullet(bullet, len(item.split(' ')[0]))
            #we have traverse all childnode, so go back 
            else:
                    return li
            #add child tree to the list
            li.append(sublist)
    return li

回答于 2025-04-15 由 Python大师

分享举报

我看不懂你想要的结果——你的结果中似乎有更多的开括号而没有对应的闭括号，我也不明白其中的逻辑。

为了让树形结构更清晰，比如可以这样做：

data = '''* list item 1
* list item 2
** list item 3
** list item 4
*** list item 5
* list item 6'''.splitlines()

class Node(object):
  def __init__(self, payload):
    self.payload = payload
    self.children = []
  def show(self, indent):
    print ' '*indent, self.payload
    for c in self.children:
      c.show(indent+2)

def makenest(linelist):
  rootnode = Node(None)
  stack = [(rootnode, 0)]
  for line in linelist:
    for i, c in enumerate(line):
      if c != '*': break
    stars, payload = line[:i], line[i:].strip()
    curlev = len(stars)
    curnod = Node(payload)
    while True:
      parent, level = stack[-1]
      if curlev > level: break
      del stack[-1]
    # a child node of the current top-of-stack
    parent.children.append(curnod)
    stack.append((curnod, curlev))
  rootnode.show(0)

makenest(data)

这个 show 方法当然是为了验证解析字符串和创建树的部分是否正确。如果你能更准确地说明你想如何把树转换成嵌套的元组和列表，我相信在 Node 类中添加一个合适的方法（可能是递归的）会很简单——所以，你能提供这个缺失的说明吗…?

编辑: 由于提问者现在已经澄清了，正如我预料的那样，满足这个要求变得简单了。只需在 class Node 中添加以下方法：

  def emit(self):
    if self.children:
      return (self.payload,
              [c.emit() for c in self.children])
    else:
      return (self.payload,)

然后把代码的最后三行（makenest 的最后一行，一个空行，以及模块级别对 makenest 的调用）改成：

  return [c.emit() for c in rootnode.children]

print(makenest(data))

（在 print 后面的括号在 Python 2 中是多余的，但在 Python 3 中是必须的，所以我加上它们以防万一;-）。

通过这些小改动，我的代码现在可以按要求运行，输出：

[('list item 1',), ('list item 2', [('list item 3',), ('list item 4', [('list item 5',)])]), ('list item 6',)]

回答于 2025-04-15 由 Python大师

分享举报

如何处理嵌套列表？

3 个回答

撰写回答