Speeding up execution, Python

Posted 2024-04-16 21:23:39


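The for loop is very expensive in execution time. I am building a correction algorithm based on Peter Norvig's spelling corrector code. I modified it a little and found that running it over a few thousand words takes far too long.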

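The algorithm checks candidates at edit distance 1 and 2 and corrects the word. I have extended it to edit distance 3, which may be adding to the time (I am not sure). Here is the final part, where the most frequently occurring word is used as the reference: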

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word]) # this is where the problem is

    candidate_new = []
    for candidate in candidates: #this statement isnt the problem
        if soundex(candidate) == soundex(word):
            candidate_new.append(candidate)
    return max(candidate_new, key=(NWORDS.get))

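The statement for candidate in candidates seems to be what increases the execution time. Peter Norvig's code is available for reference (linked in the original post).
I have found the source of the problem. It is in the statement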

candidates = (known([word]).union(known(edits1(word)))
             ).union(known_edits2(word).union(known_edits3(word)) or [word])

where

def known_edits3(word):
    return set(e3 for e1 in edits1(word) for e2 in edits1(e1) 
                                      for e3 in edits1(e2) if e3 in NWORDS)  

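As you can see, known_edits3 contains three nested for loops, and known_edits2 contains two. Each extra level of nesting calls edits1 on every result of the previous level, so the work is multiplied at each level rather than merely tripled. So this is the culprit.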

How can I reduce the cost of this expression? Can anyone help?


1 answer
User
Answer #1 · posted 2024-04-16 21:23:39

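There are a couple of ways to improve performance here:

  1. Use a list comprehension (or a generator expression)
  2. Do not compute the same thing on every iteration

The code then reduces to: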

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])

    # Compute the soundex of the input word once, outside the loop
    soundex_word = soundex(word)

    # List comprehension
    candidate_new = [candidate for candidate in candidates if soundex(candidate) == soundex_word]

    # Alternative: a generator expression, which saves memory (use one or the other)
    # candidate_new = (candidate for candidate in candidates if soundex(candidate) == soundex_word)

    return max(candidate_new, key=(NWORDS.get))
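
If the same candidates come up across many input words, the "do not compute the same thing twice" idea can be taken one step further by caching the phonetic key. The following is only a minimal sketch: the soundex below is a simplified stand-in (the asker's actual soundex implementation is not shown), and NWORDS and best_match are toy, hypothetical names used for illustration.

```python
from functools import lru_cache

# Simplified Soundex-style letter codes (illustrative stand-in, an assumption;
# not the asker's soundex()).
_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {ch: digit for digit, letters in _GROUPS.items() for ch in letters}


@lru_cache(maxsize=None)
def soundex(word):
    """Return a 4-character phonetic key for a non-empty word.

    The result is cached, so a word seen again costs one dictionary lookup.
    """
    word = word.lower()
    key = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            key += code
        if ch not in "hw":  # h and w do not separate equal codes
            prev = code
    return (key + "000")[:4]


# Toy frequency table (assumption), standing in for the real NWORDS.
NWORDS = {"robert": 5, "rupert": 2, "rabbit": 9}


def best_match(word, candidates):
    """Pick the most frequent candidate that sounds like word."""
    target = soundex(word)  # computed once, outside the loop
    matches = (c for c in candidates if soundex(c) == target)
    # default=word avoids a ValueError when nothing matches
    return max(matches, key=lambda c: NWORDS.get(c, 0), default=word)
```

Caching only pays off when words repeat; for a single call the hoisted soundex_word is what matters.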

Another improvement relies on the fact that you only need the candidate with the maximum count:

def correct(word):
    candidates = (known([word]).union(known(edits1(word)))).union(known_edits2(word).union(known_edits3(word)) or [word])

    soundex_word = soundex(word)
    max_candidate = None
    max_nword = 0
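    for candidate in candidates:
        count = NWORDS.get(candidate, 0)  # treat unseen words as count 0
        # Check the cheap count first so soundex() runs only for potential winners
        if count > max_nword and soundex(candidate) == soundex_word:
            max_candidate = candidate
            max_nword = count  # remember the best count seen so far
    return max_candidate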
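
None of the variants above touches the expensive part the question identifies, which is building the edit-distance sets in the first place. One possible way around it, sketched below, is to evaluate the tiers lazily with an or chain, which is how Peter Norvig's original corrector is structured: a farther tier is computed only when every nearer tier is empty. Note that this changes the behavior compared with taking the union of all tiers, and the soundex filter is left out for brevity. NWORDS here is a toy table (an assumption).

```python
import string

# Toy frequency table (assumption), standing in for the real NWORDS.
NWORDS = {"spelling": 10, "spewing": 2, "the": 100}


def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the words that appear in the frequency table."""
    return {w for w in words if w in NWORDS}


def known_edits2(word):
    """Known words exactly two edit steps away from word."""
    return {e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS}


def correct(word):
    # `or` short-circuits: a more expensive tier runs only if all earlier
    # tiers produced an empty set. A known_edits3 tier could be appended
    # the same way, just before the [word] fallback.
    candidates = (known([word]) or known(edits1(word))
                  or known_edits2(word) or [word])
    return max(candidates, key=lambda w: NWORDS.get(w, 0))
```

With this structure, a correctly spelled word or a word with a known neighbor at distance 1 never pays for the distance-2 or distance-3 search.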
