Nested for loop searching two lists

1 vote
1 answer
1136 views
Asked 2025-04-16 22:54

Using: Python 2.4

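I currently have a nested loop that iterates over two lists and matches items on values that appear in both. When a match is found, the item from the r120Final list is added to a new list called "r120Delta":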

for r120item in r120Final:
    for spectraItem in spectraFinal:
        if (str(spectraItem[0]) == r120item[2].strip()) and (str(spectraItem[25]) == r120item[10]):
            r120Delta.append(r120item)
            break

The problem is that this is very slow, even though the two lists are not that large: R120 has about 64,000 rows and Spectra has about 150,000.

The r120Final list is a nested array that looks roughly like this:

r120Final[0] = [['xxx','xxx','12345','xxx','xxx','xxx','xxx','xxx','xxx','xxx','234567']]
...
r120Final[n] = [['xxx','xxx','99999','xxx','xxx','xxx','xxx','xxx','xxx','xxx','678901']]

The spectraFinal list is also a nested array, and looks roughly like this:

spectraFinal[0] = [['12345','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','234567']]
...
spectraFinal[n] = [['99999','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','xxx','678901']]

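Finally, the reason for building "r120Delta" is so that I can compare r120Final against r120Delta and find the r120 data elements that had no match. This is the function I wrote for that task, but again, it runs slowly: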

def listDiff(diffList, completeList):
    returnList = []
    for completeItem in completeList:
        if completeItem not in diffList:
            returnList.append(completeItem)
    return returnList

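Basically, I know a fair amount of Python but I am by no means an expert. I am hoping some experts can show me how to speed this up. Any help is much appreciated!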

1 Answer

2
spectra_set = set((str(spectraItem[0]), str(spectraItem[25])) for spectraItem in spectraFinal)

returnList = []
for r120item in r120Final:
    if (r120item[2].strip(), r120item[10]) not in spectra_set:
        returnList.append(r120item)

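This code adds every item that has no match to returnList. The set is built in a single pass over spectraFinal, and each membership test afterwards is a hash lookup rather than a scan of the whole list.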
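As a quick check of the set-lookup idea, here is a minimal sketch on made-up data. The helpers make_r120 and make_spectra and the sample values are assumptions for illustration only; the row lengths and field positions follow the layouts shown in the question. It also collects the matched rows, since the question builds r120Delta first.

```python
# Hypothetical helpers that build rows shaped like the ones in the question.
def make_r120(key, code):
    row = ['xxx'] * 11
    row[2] = key       # compared after strip()
    row[10] = code
    return row

def make_spectra(key, code):
    row = ['xxx'] * 26
    row[0] = key       # compared after str()
    row[25] = code     # compared after str()
    return row

# Made-up sample data, not taken from the original post.
r120Final = [make_r120(' 12345 ', '234567'),
             make_r120('99999', '678901'),
             make_r120('55555', '000000')]     # no partner in spectraFinal
spectraFinal = [make_spectra('12345', '234567'),
                make_spectra(99999, 678901)]   # non-string fields get str()-ed

# One pass over spectraFinal to collect the (field 0, field 25) pairs.
spectra_set = set((str(s[0]), str(s[25])) for s in spectraFinal)

r120Delta = []   # rows with a match, as in the question
unmatched = []   # rows without a match, which is what listDiff computed
for r120item in r120Final:
    key = (r120item[2].strip(), r120item[10])
    if key in spectra_set:
        r120Delta.append(r120item)
    else:
        unmatched.append(r120item)

print("%d matched, %d unmatched" % (len(r120Delta), len(unmatched)))
```

Splitting the rows in the same pass means the separate listDiff step is no longer needed, assuming the only purpose of r120Delta was to compute the difference.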

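If you really want to, you can write the loop as a single list comprehension. Keep the set outside the comprehension, because a set(...) call placed in the condition is rebuilt for every r120item:

spectra_set = set((str(spectraItem[0]), str(spectraItem[25])) for spectraItem in spectraFinal)
returnList = [r120item for r120item in r120Final
                 if (r120item[2].strip(), r120item[10]) not in spectra_set]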
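A set(...) call written inside a comprehension's condition is evaluated once per item of the outer list. The sketch below makes that visible, assuming simplified two-field rows and a hypothetical counting helper named spectra_keys (neither comes from the original post):

```python
calls = [0]  # one-element list so the helper can update the counter

def spectra_keys(rows):
    # Hypothetical helper: builds the key set and counts how often it runs.
    calls[0] += 1
    return set((str(row[0]), str(row[1])) for row in rows)

# Simplified stand-in data: (key, code) pairs instead of the full rows.
spectra = [('12345', '234567'), ('99999', '678901')]
r120 = [('12345', '234567'), ('55555', '000000'), ('77777', '111111')]

# Set built inside the condition: rebuilt for every r120 row.
calls[0] = 0
inline_result = [r for r in r120 if (r[0], r[1]) not in spectra_keys(spectra)]
rebuilds_inline = calls[0]

# Set built once, before the comprehension.
calls[0] = 0
keys = spectra_keys(spectra)
hoisted_result = [r for r in r120 if (r[0], r[1]) not in keys]
rebuilds_hoisted = calls[0]

print("inline: %d builds, hoisted: %d build" % (rebuilds_inline, rebuilds_hoisted))
```

Both versions return the same rows; only the number of set constructions differs. With the sizes from the question, the inline form would rebuild a set from about 150,000 rows roughly 64,000 times, which brings back the cost of the nested loop.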

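If you need the full spectraItem, use a dict keyed on the same pair (some_function_of stands for whatever you want to do with a matched pair of rows):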

spectra_dict = dict(((str(spectraItem[0]), str(spectraItem[25])), spectraItem) for spectraItem in spectraFinal)
returnList = []
for r120item in r120Final:
    key = (r120item[2].strip(), r120item[10])
    if key not in spectra_dict:
        returnList.append(r120item)
    else:
        return_item = some_function_of(r120item, spectra_dict[key])
        returnList.append(return_item)
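To make the dict variant concrete, here is a small sketch with made-up rows. The body of some_function_of below is purely an assumption (it appends one spectra field to the r120 row); the original leaves it unspecified, so substitute whatever combination you actually need.

```python
def some_function_of(r120item, spectraItem):
    # Assumed combiner for illustration: r120 row plus spectra field 1.
    return r120item + [spectraItem[1]]

def make_r120(key, code):
    # Hypothetical helper: an 11-field row like those in the question.
    row = ['xxx'] * 11
    row[2] = key
    row[10] = code
    return row

def make_spectra(key, extra, code):
    # Hypothetical helper: a 26-field row like those in the question.
    row = ['xxx'] * 26
    row[0] = key
    row[1] = extra
    row[25] = code
    return row

# Made-up sample data, not taken from the original post.
r120Final = [make_r120('12345 ', '234567'),
             make_r120('55555', '000000')]
spectraFinal = [make_spectra('12345', 'extra-A', '234567'),
                make_spectra('99999', 'extra-B', '678901')]

# Map each (field 0, field 25) pair to its full spectra row.
spectra_dict = dict(((str(s[0]), str(s[25])), s) for s in spectraFinal)

returnList = []
for r120item in r120Final:
    key = (r120item[2].strip(), r120item[10])
    if key not in spectra_dict:
        returnList.append(r120item)          # unmatched: keep as is
    else:
        returnList.append(some_function_of(r120item, spectra_dict[key]))

print(len(returnList))
```

One caveat of the dict approach: if several spectra rows share the same key pair, the dict keeps only the last one, whereas the original nested loop stopped at the first match.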
