How do I initialize and populate a list of lists in Python?

0 votes

5 answers

9305 views

Asked 2025-04-18 01:08

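What I want to do is sort some word objects (each holds a scanned word, its alphabetized version, and its length) into separate lists by length. So I initialized a list of length 0 and extend it step by step while processing the input file. I want a list nested inside a list, so that results[5] holds a list of the words of length 5. How can I do this?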

I start by initializing my list like this:

results = []

Then I scan the input file line by line, create temporary word objects, and try to put each one into the right list:

try:    #check if there exists an array for that length
    results[lineLength]
except IndexError:  #if it doesn't, create it up to that length
    # Grow the list so that the new highest index is len(word)
    difference = len(results) - lineLength
    results.extend([] for _ in range(difference))
finally:
    results[lineLength].append(tempWordObject)

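I think at least the following need to change:

(1) how the results list is initialized

(2) how objects are appended to the list

(3) how the list is extended (although I think this part is correct)

I am using Python 3.4.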

Edit:

from sys import argv
main, filename = argv
file = open(filename)
for line in file:           #go through the file
    if line == '\n':        #if the line is empty (aka end of file), exit loop
        break
    lineLength = (len(line)-1)  #get the line length 
    line= line.strip('\r\n')

    if lineLength > maxL:       #keeps track of length of longest word encountered
        maxL = lineLength

    #note: I've written a mergesort algorithm in a separate area in the code and it works 
    tempAZ = mergesort(line)    #mergesort the word into alphabetical order
    tempAZ = ''.join(tempAZ)    #merges the chars back together to form a string

    tempWordObject = word(line,tempAZ,lineLength) #creates a new word object

    try:    #check if there exists an array for that length
        results[lineLength]
    except IndexError:  #if it doesn't, create it up to that length
        # Grow the list so that the new highest index is len(word)
        difference = len(results) - lineLength
        results.extend([] for _ in range(difference))
        print("lineLength: ", lineLength, "    difference:", difference)
    finally:
        results[lineLength].append(tempWordObject)

Edit:

Here is my word class:

class word(object): #object class

    def __init__(self, originalWord=None, azWord=None, wLength=None):
        self.originalWord = originalWord
        self.azWord = azWord
        self.wLength = wLength

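Edit:

Here is a further explanation of what I'm trying to achieve: as I iterate over a word list of unknown length (with words of unknown length), I create word objects holding the word, its alphabetized version, and its length (for example: dog, dgo, 3). While iterating, I want every object to go into a list nested inside another list (results[]), indexed by word length. If results[] has no such index yet (say 3), I want to extend results[] and start a list at results[3] containing the word object (dog, dgo, 3). In the end, results[] should contain lists of words indexed by word length.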


5 Answers

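You rejected the suggestion to store your objects in a dictionary. Your real problem, though, is that you are trying to hold 6 million words, along with their scanned images, in memory. You could use indexes (or some other simple references) to keep track of the data and look it up through those indexes. Use iterators to fetch only the information you need.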
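One way to read the "indexes plus iterators" idea is sketched below. It is only an illustration under assumptions: the input is taken to be a plain text file with one ASCII word per line, and the helper names build_offset_index and iter_words are made up for this example. Instead of keeping every word in memory, it stores the byte offset of each line grouped by word length, then reads words back lazily.

```python
import os
import tempfile
from collections import defaultdict

def build_offset_index(path):
    """Map word length -> list of byte offsets where words of that length start."""
    index = defaultdict(list)
    offset = 0
    with open(path, "rb") as f:          # binary mode, so offsets are exact byte positions
        for raw in f:
            word = raw.strip()
            if word:                     # ignore blank lines
                index[len(word)].append(offset)
            offset += len(raw)           # advance past this line, including its newline
    return index

def iter_words(path, offsets):
    """Lazily yield the words stored at the given byte offsets."""
    with open(path, "rb") as f:
        for offset in offsets:
            f.seek(offset)
            yield f.readline().strip().decode("utf-8")

# Tiny demo file standing in for the real word list (hypothetical data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("dog\nhorse\ncat\n")
    demo_path = tmp.name

index = build_offset_index(demo_path)
three_letter_words = list(iter_words(demo_path, index[3]))
os.remove(demo_path)
print(three_letter_words)  # ['dog', 'cat']
```

The index holds only integers, so its size grows with the number of words rather than with the amount of data attached to each word.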

Answered 2025-04-18 by Python大师


Instead of a list, you can use a dictionary:

d = {}

Here the dictionary keys are the lengths, and each value is a list of words:

if lineLength not in d:       # first word of this length: start a new list
    d[lineLength] = []
d[lineLength].append(tempWordObject)

You can simplify this further by using d = collections.defaultdict(list).

Answered 2025-04-18 by Python大师


Your difference is negative. You need to subtract the other way around. You also need to add one, because indexes start at 0.

difference = lineLength - len(results) + 1
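To see the corrected formula in context, here is a minimal sketch of the list-growing approach. The helper name add_by_length is invented for this example, plain strings stand in for the word objects, and the try/except/finally from the question is swapped for an explicit length check.

```python
def add_by_length(results, item, length):
    """Append item to results[length], padding results with empty lists if needed."""
    difference = length - len(results) + 1   # number of missing slots (<= 0 if none)
    if difference > 0:
        results.extend([] for _ in range(difference))
    results[length].append(item)

results = []
for w in ["dog", "horse", "cat", "a"]:
    add_by_length(results, w, len(w))

print(results)  # [[], ['a'], [], ['dog', 'cat'], [], ['horse']]
```

Each generator step creates a fresh list, so the padded slots do not share state with one another.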

In practice, a defaultdict is usually simpler.

For example:

from collections import defaultdict
D = defaultdict(list)                 # a missing key starts out as an empty list
for tempWordObject in the_file:       # the_file: any iterable of words (placeholder)
    D[len(tempWordObject)].append(tempWordObject)

Answered 2025-04-18 by Python大师


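If you decide to stick with a list (although it may not be the best choice), I think it is simpler and clearer to create the list large enough from the start. That is, if the longest word is 5 characters long, you can begin with a list like this: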

output = [None, [], [], [], [], []]

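The advantage is that you don't have to worry about index errors while processing, but it requires you to know all of the words before you start. Since you have already created a class to store them, I assume you really are storing them, so this should not be a problem.

You always need a None at the front so that the indexes line up. Once you have that, you can loop over your word list and simply append each word to the right list, as before.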

for word in wordlist:
    output[len(word)].append(word)

Specifically, I suggest that instead of storing tempWordObject, you build a list (wordObjList) of your word objects.

Generate the template list:


output = [None]
for i in range(maxLen):
    output.append([])

Populate this list from your list of word objects:
for wordObj in wordObjList:
    output[wordObj.wLength].append(wordObj.originalWord)
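As a side note, the template list can also be built in one expression. The sketch below assumes a hypothetical max_len of 5 and uses plain strings in place of word objects. It also shows a common pitfall: multiplying a list of lists repeats a single inner list, so every bucket ends up being the same object.

```python
max_len = 5  # hypothetical: length of the longest word

# Each iteration builds a new list, so the buckets are independent.
output = [None] + [[] for _ in range(max_len)]
output[3].append("dog")
print(output)  # [None, [], [], ['dog'], [], []]

# Pitfall: multiplication repeats ONE list object max_len times.
shared = [None] + [[]] * max_len
shared[3].append("dog")
print(shared)  # [None, ['dog'], ['dog'], ['dog'], ['dog'], ['dog']]
```

The loop with output.append([]) in this answer behaves like the first form, which is why it is safe.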


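A few other things to note:

You don't need to handle the end of the file yourself. When Python reaches the end of the file in a for loop, it stops iterating automatically.
Be sure to close the file. You can use a with statement for this (with open("file.txt", 'r') as f: followed by for line in f:).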
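Both notes can be combined into a small reading loop. This is a sketch under assumptions: the helper name read_words and the temporary demo file are made up, and blank lines are skipped rather than treated as the end of the input. Measuring the length after stripping also avoids an off-by-one when the last line has no trailing newline.

```python
import os
import tempfile

def read_words(path):
    """Return the non-empty lines of a file with their line endings removed."""
    words = []
    with open(path) as f:              # the file is closed automatically, even on error
        for line in f:                 # iteration stops by itself at end of file
            word = line.strip("\r\n")
            if word:                   # skip blank lines instead of breaking out
                words.append(word)
    return words

# Demo file standing in for the real input (hypothetical data, no final newline).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("dog\n\nhorse\ncat")
    demo_path = tmp.name

words = read_words(demo_path)
os.remove(demo_path)
print(words)                      # ['dog', 'horse', 'cat']
print([len(w) for w in words])    # [3, 5, 3]
```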


                    
                        
Answered 2025-04-18 by Python大师

There are three things to note about your question.

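Initializing nested lists
You mention this in the question title, although you may not need it in the end. A simple way to do it is with two nested list comprehensions: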
import pprint

m, n = 3, 4  # 2D: 3 rows, 4 columns
lol = [[(j, i) for i in range(n)] for j in range(m)]

pprint.pprint(lol)
# [[(0, 0), (0, 1), (0, 2), (0, 3)],
#  [(1, 0), (1, 1), (1, 2), (1, 3)],
#  [(2, 0), (2, 1), (2, 2), (2, 3)]]

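Using a data structure with defaults
As others have mentioned, you can use a dictionary. In particular, collections.defaultdict initializes entries automatically when they are first needed: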
import collections

dd = collections.defaultdict(list)

for value in range(10):
    dd[value % 3].append(value)

pprint.pprint(dd)
# defaultdict(<class 'list'>, {0: [0, 3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]})

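Comparing custom objects
The built-in sorted function takes a keyword argument, key, which lets you order custom objects that do not define their own comparison: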
import operator

class Thing:
    def __init__(self, word):
        self.word = word
        self.length = len(word)

    def __repr__(self):
        return '<Word %s>' % self.word

things = [Thing('the'), Thing('me'), Thing('them'), Thing('anybody')]
print(sorted(things, key=lambda obj: obj.length))
# [<Word me>, <Word the>, <Word them>, <Word anybody>] 
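Sorting by length also gets you close to the grouping the question asks for. The sketch below is one possible follow-up rather than part of the original answer: it reuses the Thing class from above, adds a hypothetical extra word ('dog'), and feeds the sorted list to itertools.groupby, which only merges adjacent items with equal keys and therefore needs sorted input.

```python
from itertools import groupby
from operator import attrgetter

class Thing:
    def __init__(self, word):
        self.word = word
        self.length = len(word)

    def __repr__(self):
        return '<Word %s>' % self.word

things = [Thing(w) for w in ('the', 'me', 'them', 'anybody', 'dog')]

by_length = attrgetter('length')            # same role as lambda obj: obj.length
ordered = sorted(things, key=by_length)     # groupby needs equal keys to be adjacent
grouped = {length: list(group) for length, group in groupby(ordered, key=by_length)}

print(grouped)
# {2: [<Word me>], 3: [<Word the>, <Word dog>], 4: [<Word them>], 7: [<Word anybody>]}
```

Because sorted is stable, words of the same length keep their original relative order inside each group.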


                        
                    
                    
                        
Answered 2025-04-18 by Python大师