Finding the indices of a word that occurs multiple times in NLTK


数据工程师

Asked 2025-04-18 02:35

I'm trying to use Python to find the index positions of the word 'the' in a text.

sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']

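If I call sent3.index('the'), I get 1, which is the index of the first occurrence of 'the'. What I can't work out is how to find the positions of the other occurrences. Does anyone know how I could do this?

Thanks!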

Tags: text-processing, natural-language-processing, nltk, word-frequency, word-index

2 Answers

>>> from collections import defaultdict
>>> sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> idx = defaultdict(list)
>>> for i,j in enumerate(sent3):
...     idx[j].append(i)
... 
>>> idx['the']
[1, 5, 8]
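For reference, the same one-pass index can be sketched as a plain script. The helper name `build_index` and the word 'sea' are hypothetical and used only for illustration; they are not part of NLTK or of the session above. This approach pays off mainly when several different words need to be looked up, since the list is scanned only once.

```python
from collections import defaultdict

def build_index(tokens):
    """Map each token to the list of positions where it occurs (hypothetical helper)."""
    idx = defaultdict(list)
    for position, token in enumerate(tokens):
        idx[token].append(position)
    return idx

sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
idx = build_index(sent3)

print(idx['the'])           # [1, 5, 8]
# idx['sea'] would silently insert an empty list for 'sea';
# .get() looks the word up without modifying the index.
print(idx.get('sea', []))   # []
```

Note that indexing a defaultdict with a missing key creates that key, so `.get()` is the safer choice for words that may not occur.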


Answered 2025-04-18 by Python大师


[i for i, item in enumerate(sent3) if item == wanted_item]

Example:

>>> sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> [i for i, item in enumerate(sent3) if item == 'the']
[1, 5, 8]

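enumerate takes an iterable (such as a list) and returns an iterator that yields tuples of two parts: first the index, then the corresponding value. It does not build a list by itself; the pairs are produced one at a time as the comprehension loops over them. We use this to check whether each value is the one we want and, if it is, keep its index.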
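To make the pairs visible, and to show an alternative that stays close to the sent3.index('the') call from the question, here is a minimal sketch. The function name `find_all` is hypothetical; it relies on the optional start argument of list.index, which resumes the search from a given position.

```python
sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']

# enumerate is lazy; wrapping it in list() shows the (index, value) pairs.
pairs = list(enumerate(sent3))
print(pairs[:3])   # [(0, 'In'), (1, 'the'), (2, 'beginning')]

def find_all(tokens, wanted):
    """Collect every position of `wanted` by calling list.index repeatedly."""
    positions = []
    start = 0
    while True:
        try:
            # Search only from `start` onward.
            pos = tokens.index(wanted, start)
        except ValueError:
            # list.index raises ValueError when nothing more is found.
            break
        positions.append(pos)
        start = pos + 1
    return positions

print(find_all(sent3, 'the'))   # [1, 5, 8]
```

The list comprehension is shorter and usually the clearer choice; the loop is shown only because it extends what the question already tried.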

Answered 2025-04-18 by Python大师

