Split text into sentences on "and"/"or"

Posted 2024-06-16 12:17:34


Text string:

text = 'Turn left and take the door between stairs and elevator. Turn right to the corridor.'

Expected output:

splitted_sentences = ['turn left', 'take the door between stairs and elevator', 'turn right to the corridor']

How can I split this text in Python into the sentences shown in the splitted_sentences list?


1 answer
User
#1 · Posted 2024-06-16 12:17:34
I wrote some code that produces the desired output for this input.
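import re
from nltk.tokenize import RegexpTokenizer

text = 'Turn left and take the door between stairs and elevator. Turn right to the corridor.'
text = text.lower()
# Replace only the standalone word "and", so words like "stand" are left alone.
text = re.sub(r'\band\b', ',', text)
# Split on clause and sentence punctuation (raw string avoids invalid-escape warnings).
split1 = re.split(r'; |[.] |[:]|, |\* |\n', text)
# Tokenize each chunk into words; this also drops leftover punctuation.
tokenizer = RegexpTokenizer(r'\w+')
tokens = [tokenizer.tokenize(chunk) for chunk in split1]
tokens = [t for t in tokens if t]  # discard empty chunks

d = []
i = 0
while i < len(tokens):
    # "between X and Y" was split at "and": rejoin this chunk with the next one.
    if 'between' in tokens[i] and i + 1 < len(tokens):
        d.append(tokens[i] + ['and'] + tokens[i + 1])
        i += 2
    else:
        d.append(tokens[i])
        i += 1

# Join the words back into strings to match the expected output.
splitted_sentences = [' '.join(words) for words in d]
print(splitted_sentences)
# ['turn left', 'take the door between stairs and elevator', 'turn right to the corridor']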
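
For comparison, here is a minimal sketch of the same idea that uses only the standard-library re module instead of nltk. The function name split_clauses is hypothetical, and the rule it applies (rejoin a part with the previous one when the previous part contains a "between" that has no "and" after it yet) is an assumption chosen to fit the example sentence, not a general solution. It only handles "and", not "or".

```python
import re

def split_clauses(text):
    """Split text into lowercase clauses at sentence ends and at the word
    'and', keeping 'between X and Y' together (illustrative sketch only)."""
    clauses = []
    # Sentence boundaries: '.', '!' or '?' followed by whitespace or end of text.
    for sentence in re.split(r'[.!?]+(?:\s+|$)', text.lower()):
        # Split on the standalone word "and" (surrounded by whitespace).
        parts = re.split(r'\s+and\s+', sentence.strip())
        merged = []
        for part in parts:
            # Assumption: a "between" with no later "and" in the previous part
            # means this "and" belonged to "between X and Y", so rejoin it.
            if merged and re.search(r'\bbetween\b(?!.*\band\b)', merged[-1]):
                merged[-1] += ' and ' + part
            else:
                merged.append(part)
        # Skip empty strings left over from trailing punctuation.
        clauses.extend(p for p in merged if p)
    return clauses

text = 'Turn left and take the door between stairs and elevator. Turn right to the corridor.'
print(split_clauses(text))
```

Because the split pattern requires whitespace on both sides of "and", words that merely contain those letters (such as "stand") are not split.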
