How can I make my code detect the end of a string in Python?

2 votes

5 answers

16728 views

Asked 2025-04-18 06:46

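I'm trying to write some code that strips the punctuation from a sentence and then splits it into words. For example, if the user enters "Hello, how are you?", I want the sentence split into the form ['hello','how','are','you'].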

userinput = str(raw_input("Enter your sentence: "))

def sentence_split(sentence):
    result = []
    current_word = ""
    for letter in sentence:
        if letter.isalnum(): 
            current_word += letter     
        else: ## this is a symbol or punctuation, e.g. reach end of a word
            if current_word: 
                result.append(current_word)
                current_word = "" ## reinitialise for creating a new word
    return result

print "Split of your sentence:", sentence_split(userinput)

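So far my code works, but if I enter a sentence that does not end with punctuation, the last word is missing from the result. For example, if the input is "Hello, how are you", the result is ['hello','how','are']. I assume this is because there is no punctuation to tell the code that the sentence has ended. Is there a way for the program to detect the end of the string, so that even with the input "Hello, how are you" the result is ['hello','how','are','you']?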


5 Answers

The problem with your code is that you never do anything with current_word at the end unless you hit a character that is not a letter or a digit:

for letter in sentence:
    if letter.isalnum():
        current_word += letter     
    else:
        if current_word: 
            result.append(current_word)
            current_word = ""
return result

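If the last character is a letter or a digit, it is added to current_word, but current_word is then never appended to result. You can fix this by repeating the append logic once after the loop has finished: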

for letter in sentence:
    if letter.isalnum():
        current_word += letter     
    else:
        if current_word: 
            result.append(current_word)
            current_word = ""

if current_word: 
    result.append(current_word)

return result

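This way, if current_word is not empty when the loop ends, it is appended to the result as well. If the last character really was punctuation, current_word has already been reset to an empty string, so the if condition after the loop is false and nothing is added twice.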
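For reference, here is the fix assembled into one self-contained function. This is only a sketch in Python 3 syntax (the question's code is Python 2), and the sample sentences are just illustrations.

```python
def sentence_split(sentence):
    """Split a sentence into alphanumeric words, dropping punctuation."""
    result = []
    current_word = ""
    for letter in sentence:
        if letter.isalnum():
            current_word += letter
        elif current_word:
            # A non-alphanumeric character ends the current word.
            result.append(current_word)
            current_word = ""
    # Flush the last word when the sentence does not end with punctuation.
    if current_word:
        result.append(current_word)
    return result


print(sentence_split("Hello, how are you"))
print(sentence_split("Hello, how are you?"))
```

Both calls should print the same four words, because the final check after the loop only fires when a word is still pending.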

Answered 2025-04-18 by Python大师


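Because the algorithm expects every word to be followed by punctuation or a space, you can append a space to the end of the input, which guarantees that the last word gets flushed: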

userinput = str(raw_input("Enter your sentence: ")) + " "

Result:

Enter your sentence: hello how are you
Split of your sentence: ['hello', 'how', 'are', 'you']
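The same sentinel idea can also live inside the function, so callers do not have to remember to add the space themselves. A rough sketch in Python 3 syntax; the name sentence_split_sentinel is made up for this illustration.

```python
def sentence_split_sentinel(sentence):
    """Split into alphanumeric words, using an appended space as an end marker."""
    result = []
    current_word = ""
    # The trailing space guarantees that the last word is followed by a
    # non-alphanumeric character, so the elif branch always flushes it.
    for letter in sentence + " ":
        if letter.isalnum():
            current_word += letter
        elif current_word:
            result.append(current_word)
            current_word = ""
    return result


print(sentence_split_sentinel("hello how are you"))
```

The trade-off is one extra string copy per call in exchange for not duplicating the append logic after the loop.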

Answered 2025-04-18 by Python大师


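Method 1:

Why not simply use re.split('[list of characters I do not like]', s)?

https://docs.python.org/2/library/re.html

Method 2:

Clean up the string (strip the unwanted characters):

http://pastebin.com/raw.php?i=1j7ACbyK

Then split it with s.split(' ') (or s.split(), which also collapses runs of whitespace).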
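As a rough sketch of method 1 (Python 3 syntax): the character class below is only an example set of separators, not a complete list, and the name split_on_separators is invented for this illustration.

```python
import re

# Example separator set only; extend the class with whatever you want to drop.
SEPARATORS = re.compile(r"[ ,.!?;:]+")


def split_on_separators(sentence):
    # Leading or trailing separators produce empty strings, so filter them out.
    return [word for word in SEPARATORS.split(sentence) if word]


print(split_on_separators("Hello, how are you?"))
print(split_on_separators("Hello, how are you"))
```

Since re.split works on the whole string at once, there is no "last word" special case to handle.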

Answered 2025-04-18 by Python大师


You could try something like this:

def split_string(text, splitlist):
    for sep in splitlist:
        text = text.replace(sep, splitlist[0])
    return filter(None, text.split(splitlist[0])) if splitlist else [text]

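If you set splitlist to "!?,." or whatever separators you need (include the space if you also want to split on whitespace), this code first replaces every separator with the first one in splitlist, then splits the whole sentence on that separator, and finally drops the empty strings from the returned list (that is what filter(None, list) does).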
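To make the behaviour concrete, here is a Python 3 flavoured sketch of the same idea. In Python 3, filter returns an iterator rather than a list, so this variant uses a list comprehension instead; the sample inputs are only illustrations.

```python
def split_string(text, splitlist):
    """Python 3 variant of the function above; always returns a list."""
    if not splitlist:
        return [text]
    first = splitlist[0]
    for sep in splitlist:
        # Normalise every separator to the first one.
        text = text.replace(sep, first)
    # Drop the empty strings left behind by consecutive separators.
    return [part for part in text.split(first) if part]


# The space is part of splitlist here, so words are separated too.
print(split_string("Hello, how are you?", " !?,."))
# Without the space, only the punctuation acts as a separator.
print(split_string("Hello, how are you?", "!?,."))
```

The second call shows why the space matters: with only punctuation in splitlist, " how are you" stays together as one piece.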

Or you could use this simple regular-expression solution:

>>> import re
>>> s = "Hello, how are you?"
>>> re.findall(r'([A-Za-z]+)', s)
['Hello', 'how', 'are', 'you']
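One thing to be aware of: [A-Za-z]+ matches ASCII letters only, whereas the isalnum() check in the question also keeps digits. A small sketch (Python 3 syntax) that keeps digits and lowercases the words, as in the output the question asks for; the name lowercase_words is made up here.

```python
import re


def lowercase_words(sentence):
    # [A-Za-z0-9]+ keeps ASCII letters and digits; punctuation and spaces never match.
    return [word.lower() for word in re.findall(r"[A-Za-z0-9]+", sentence)]


print(lowercase_words("Hello, how are you"))
print(lowercase_words("Room 101, please"))
```

Because findall collects matches instead of reacting to separators, the last word is picked up whether or not the sentence ends with punctuation.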

Answered 2025-04-18 by Python大师


I did not try to adapt your algorithm, but I think the approach below should do what you want.

def sentence_split(sentence):
    new_sentence = sentence[:]
    for letter in sentence:
        if not letter.isalnum():
            new_sentence = new_sentence.replace(letter, ' ')
    return new_sentence.split()

Now run it:

runfile(r'C:\Users\cat\test.py', wdir=r'C:\Users\cat')

['Hello', 'how', 'are', 'you']

Edit: fixed a bug in how new_sentence was initialised.

Answered 2025-04-18 by Python大师
