How do I iterate over a sentence string in Python?

2 answers

User

Answer 1 · edited 2024-06-06 23:45:49

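I originally posted this as a comment, but I thought I would flesh it out into a full answer with some explanation:

You want to use str.split() to split the string into words, and then stem each word: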

for word in text.split(" "):
    PorterStemmer().stem_word(word)
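
Note that the loop above calls the stemmer but does not keep the result. A minimal sketch of collecting the stems in a list follows. The `toy_stem` function is a hypothetical stand-in that only strips a few suffixes, used here so the example runs without any stemming library; it is not the Porter algorithm.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters of the word would remain.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


text = "the cats walked and jumped"

# Keep each stem instead of discarding the return value.
stems = []
for word in text.split(" "):
    stems.append(toy_stem(word))

print(stems)
```

With a real stemmer you would swap `toy_stem(word)` for the stemmer's own call.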

When you want all of the stems back in a single string, joining them again is simple. To do this conveniently and efficiently, we use str.join() and a generator expression:

" ".join(PorterStemmer().stem_word(word) for word in text.split(" "))

Edit:

As for your other question:

with open("/path/to/file.txt") as f:
    # Strip the trailing newline from each line so membership tests match.
    words = set(line.strip() for line in f)

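Here we use the with statement to open the file (this is the best way to open files, since it closes them correctly even when an exception occurs, and it is more readable) and read the contents into a set. We use a set because we don't care about the order of the words or about duplicates, and it will be more efficient later. I am assuming there is one word per line; if that isn't the case and the words are comma-separated or whitespace-separated, then using str.split() as we did before (with appropriate arguments) is probably a good plan.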
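
One detail worth checking with the one-word-per-line layout: iterating over a file yields lines that still end with a newline, so the words usually need stripping before they go into the set. Below is a minimal sketch of that, plus the comma-separated case. The temporary file and its contents are made up for illustration.

```python
import os
import tempfile

# Create a small sample file so the sketch is self-contained (made-up contents).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the\nand\nof\n")
    path = tmp.name

try:
    with open(path) as f:
        raw = set(f)  # each element still ends with "\n"
    with open(path) as f:
        # Strip whitespace and skip blank lines.
        words = {line.strip() for line in f if line.strip()}
finally:
    os.remove(path)

# Comma-separated input: split on "," and strip the surrounding spaces.
comma_words = {w.strip() for w in "the, and, of".split(",")}

print("the" in raw)    # False, because the stored entry is "the\n"
print("the" in words)  # True
```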

stems = (PorterStemmer().stem_word(word) for word in text.split(" "))
" ".join(stem for stem in stems if stem not in words)

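Here we use the if clause of the generator expression to skip any words that are in the set loaded from the file. Membership checks on a set are O(1) on average, so this should be relatively efficient.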

Edit 2:

To remove the words before stemming, it is even simpler:

" ".join(PorterStemmer().stem_word(word) for word in text.split(" ") if word not in words)

Removing the given words is simple:

filtered_words = [word for word in unfiltered_words if word not in set_of_words_to_filter]
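
The two filtering orders in this answer can give different results, because a word and its stem are not always the same string. Here is a small sketch of the difference. `toy_stem` is a hypothetical stand-in that strips a few suffixes (not the Porter algorithm), and the word set is made up for illustration.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


words = {"the", "and", "cats"}  # words to drop (made-up example set)
text = "the cats walked and jumped"

# Filter after stemming: the stems are compared against the set.
stems = (toy_stem(word) for word in text.split(" "))
after = " ".join(stem for stem in stems if stem not in words)

# Filter before stemming: the original words are compared against the set.
before = " ".join(toy_stem(word) for word in text.split(" ") if word not in words)

print(after)   # cat walk jump
print(before)  # walk jump
```

Here "cats" is in the set but its stem "cat" is not, so it only gets removed when filtering happens before stemming. Which order is right depends on whether the file holds original words or stems.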

User

Answer 2 · edited 2024-06-06 23:45:49

To go through each word in the string:

for word in text.split():
    PorterStemmer().stem_word(word)
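
This loop calls text.split() with no argument, while other snippets on this page call text.split(" "). The two behave differently on repeated spaces, tabs, and newlines, as this short sketch with a made-up input string shows:

```python
text = "stemming  is\tfun\n"

# Splitting on a single space keeps empty strings and other whitespace.
by_space = text.split(" ")

# Splitting with no argument splits on any run of whitespace.
by_whitespace = text.split()

print(by_space)       # ['stemming', '', 'is\tfun\n']
print(by_whitespace)  # ['stemming', 'is', 'fun']
```

For free-form text, the no-argument form is usually the safer choice.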

Use the string join method (as recommended by Lattyware) to concatenate the pieces into one big string.

" ".join(PorterStemmer().stem_word(word) for word in text.split(" "))
