How do I extract a list of words from a user's sentence?

Asked 2025-04-16 14:15

I'm not sure what the user will type, but I want to split their input sentence into a list of words.

User_input = raw_input("Please enter a search criterion: ")
User_Input_list[""]

# input example: steve at the office

# compiling the regular expression:
keyword = re.compile(r"\b[aA-zZ]\b")
     for word in User_input:
         User_Input_list.append(word?)

# going by the example input I'd want
# User_Input_list["steve", "at" , "the" , "office"]

I don't know how to split the input into individual words. Cookies for whoever helps!

Tags: text processing, natural language processing, word extraction

6 Answers

User_input = raw_input("Please enter a search criterion: ")
User_Input_list = User_input.split(" ")

See:

http://docs.python.org/library/stdtypes.html
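
One detail worth knowing about the snippet above: split(" ") and split() behave differently when the input has repeated or trailing spaces. The following is a minimal sketch of that difference; the sample string is made up for illustration.

```python
# Hypothetical sample input with a double space and a trailing space.
text = "steve  at the office "

# Splitting on a single literal space keeps an empty string for each extra space.
by_space = text.split(" ")

# Splitting with no argument collapses runs of whitespace and trims both ends.
by_whitespace = text.split()

print(by_space)
print(by_whitespace)
```

If the input comes straight from a user, the no-argument form is usually the more forgiving choice.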

Answered 2025-04-16 by Python大师

The simplest solution is probably the split method:

>>> "steve at the office".split()
['steve', 'at', 'the', 'office']

However, this does not remove punctuation, which may or may not be a problem for you:

>>> "steve at the office.".split()
['steve', 'at', 'the', 'office.']
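
If trailing punctuation like that is the only concern, one possible approach is to trim punctuation from the ends of each token after splitting. This is only a sketch: the helper name split_and_strip is hypothetical, and it assumes that ASCII punctuation (string.punctuation) is all that needs trimming.

```python
import string


def split_and_strip(sentence):
    """Split on whitespace, then trim ASCII punctuation from both ends of each token."""
    words = (token.strip(string.punctuation) for token in sentence.split())
    # Drop tokens that consisted of nothing but punctuation (e.g. a lone "-").
    return [word for word in words if word]


print(split_and_strip("steve at the office."))
```

Because only the ends of each token are trimmed, an apostrophe inside a word such as isn't is left alone.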

You can use re.split() to split on runs of non-word characters, which leaves only letters, digits, and underscores:

>>> import re
>>> re.split(r'\W+', 'steve at the office.')
['steve', 'at', 'the', 'office', '']

But as you can see above, this can produce empty results, and it gets worse once you run into more complex punctuation:

>>> re.split(r"\W+", "steve isn't at the office.")
['steve', 'isn', 't', 'at', 'the', 'office', '']
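
The empty strings, at least, are easy to drop by filtering the result. A minimal sketch follows (the function name split_non_word is made up here); note that it does nothing about the apostrophe being treated as a separator.

```python
import re


def split_non_word(sentence):
    """Split on runs of non-word characters and discard empty pieces."""
    # re.split leaves '' at the edges when the text starts or ends with punctuation.
    return [piece for piece in re.split(r"\W+", sentence) if piece]


print(split_non_word("steve at the office."))
```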

So you could spend some time choosing a more suitable regular expression, but you would need to decide how to handle text such as steve isn't at the 'office'.

So to choose the solution that fits, you need to think about what kind of input you will get and what kind of output you want.
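
As one illustration of such a decision, here is a sketch that matches words directly with re.findall instead of splitting on separators. It rests on two assumptions that may not suit your input: words consist of ASCII letters only, and an apostrophe counts as part of a word only when it has letters on both sides. The name extract_words and the pattern are illustrative, not a standard recipe.

```python
import re

# One or more letters, optionally followed by apostrophe-plus-letters groups,
# so "isn't" stays whole while surrounding quotes are left out.
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def extract_words(sentence):
    """Return the words found in the sentence, in order of appearance."""
    return WORD_PATTERN.findall(sentence)


print(extract_words("steve isn't at the 'office'."))
```

Under these assumptions digits and non-ASCII letters are silently dropped, so the character classes would need widening for other kinds of input.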

Answered 2025-04-16 by Python大师

User_Input_list = User_input.split()

Answered 2025-04-16 by Python大师