Efficient lookup in a large list in Python 3

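all_word_prons = get_word_pron_pairs_in_cmu_dict()

def pron_is_a_word(pronunciation):
    # Linear scan over every (word, pronunciation) pair.
    for word, pron in all_word_prons:
        if pronunciation == pron:
            return True
    # Only report a miss after every entry has been checked.
    return False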

import os
import pickle

def build_fast_idict_tree():
    from nltk.corpus import cmudict
    entries = cmudict.entries()
    idict = {}
    for entry in entries:
        word, pronunciation = entry
        # Walk down the nested dicts, one level per pronunciation symbol.
        idict_level = idict
        for syl in pronunciation:
            if syl not in idict_level:
                idict_level[syl] = {}
            idict_level = idict_level[syl]
        # Key 0 marks the end of a complete pronunciation and stores the word.
        idict_level[0] = word
    return idict

def get_fast_idict_tree():
    filename = "fast_idict_tree.pickle"
    if os.path.isfile(filename):
        with open(filename, "rb") as f:
            tree = pickle.load(f)
    else:
        tree = build_fast_idict_tree()
        with open(filename, "wb") as f:
            pickle.dump(tree, f)
    return tree

def lookup_in_fast_idict_tree(syls):
    idict = get_fast_idict_tree()
    for syl in syls:
        if syl not in idict:
            return False
        idict = idict[syl]
    return idict[0] if 0 in idict else False

3 Answers

User

#1 · Edited 2024-04-24 14:23:11

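If I understand correctly, you need to check whether a pronunciation is present in your dataset. Judging from your first code block, you don't seem to care which word the match comes from.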

So I think we can do this:

pron_set = {tuple(pron) for word, pron in all_word_prons}
# do something to get a list of pronunciations to check
for pronunciation in pronunciation_list:
    if tuple(pronunciation) in pron_set:
        print('pronunciation')

We build pron_set from tuples because a list is unhashable (it cannot be a member of a set).

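A set lookup should be much faster than iterating over a list: membership in a set is a hash lookup that takes roughly constant time on average, while scanning a list takes time proportional to its length. I recommend getting familiar with Python data structures; you never know when a deque will save you a lot of time.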
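To make this concrete, here is a minimal sketch that uses a tiny made-up list in place of all_word_prons. The sample entries and the helper names pron_in_list and pron_in_set are assumptions for illustration only; they are not part of the original code. It shows that both approaches return the same answers, and that a list cannot be placed in a set directly.

```python
# Hypothetical sample data standing in for all_word_prons.
sample_word_prons = [
    ("cat", ["K", "AE1", "T"]),
    ("dog", ["D", "AO1", "G"]),
    ("read", ["R", "IY1", "D"]),
    ("read", ["R", "EH1", "D"]),
]


def pron_in_list(pronunciation, word_prons):
    """Linear scan: compares against entries until a match is found."""
    return any(pronunciation == pron for _, pron in word_prons)


# Build the set once, up front. Lists are unhashable, so each
# pronunciation is converted to a tuple first.
pron_set = {tuple(pron) for _, pron in sample_word_prons}


def pron_in_set(pronunciation, prons):
    """Hash lookup: the cost does not grow with the number of entries."""
    return tuple(pronunciation) in prons


# Trying to put a list into a set raises TypeError.
try:
    {["K", "AE1", "T"]}
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
```

The set only pays off if it is built once and reused for many lookups; rebuilding it on every call would cost as much as the scan it replaces.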

User

#2 · Edited 2024-04-24 14:23:11

Have you considered using Python list comprehensions, as outlined here?

https://docs.python.org/3/tutorial/datastructures.html

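In some cases a list comprehension can be faster than a plain for loop, but it still executes a loop at the bytecode level. If you're not sure what I mean, check the following thread: Are list-comprehensions and functional functions faster than "for loops"?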

It may be worth trying to see whether it is any faster.
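One way to find out is to measure it. The sketch below is only an illustration under stated assumptions: the data is synthetic rather than the real CMU dictionary, and the function names scan_with_loop, scan_with_any and lookup_in_set are made up for this example. It times a plain loop, a generator expression passed to any(), and a set lookup with the standard timeit module. Actual numbers depend on the machine, so none are claimed here.

```python
import timeit

# Hypothetical synthetic data: 50,000 distinct three-symbol "pronunciations".
prons = [[str(i), str(i + 1), str(i + 2)] for i in range(50_000)]
target = prons[-1]  # worst case for a linear scan: the last entry


def scan_with_loop(pronunciation):
    """Plain for loop over the whole list."""
    for pron in prons:
        if pronunciation == pron:
            return True
    return False


def scan_with_any(pronunciation):
    """Generator expression: shorter to write, but still a linear scan."""
    return any(pronunciation == pron for pron in prons)


# Built once and reused for every lookup.
pron_set = {tuple(pron) for pron in prons}


def lookup_in_set(pronunciation):
    """Single hash lookup instead of a scan."""
    return tuple(pronunciation) in pron_set


if __name__ == "__main__":
    for fn in (scan_with_loop, scan_with_any, lookup_in_set):
        seconds = timeit.timeit(lambda: fn(target), number=20)
        print(f"{fn.__name__}: {seconds:.4f} s for 20 lookups")
```

Both scanning variants do work proportional to the length of the list, so rewriting the loop as a comprehension changes the constant factor at best, while the set changes how the cost grows with the data.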

TL;DR
