How to trim a piece of text down to complete sentences in Python

Posted 2024-04-29 10:29:36

I have a piece of text that starts and ends in the middle of a sentence. For example:

reased 11%. Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged. Indus

The full text is:

... increased 11%. Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged. Industry ...

What I want is this: if a sentence is cut off, like "... increased 11%." and "Industry ...", I discard it and return only the complete sentence: Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged.

Can I do this with nltk or spacy?

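Update

Sorry, I did not make my question clear in the original post.

There can be different cases:

  • Hey! How are you? Good! should return Hey! How are you? Good!

  • ...ey. How are you? I am good. How about.... should return How are you? I am good.

I do not know in advance how many complete sentences the text contains.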


1 Answer
User
#1 · Posted 2024-04-29 10:29:36

You can use spacy to get all the sentences in a string, for example:

import spacy
nlp = spacy.load("en_core_web_sm")
doc = nlp("I am good. How are you? Thank you.")
for sent in doc.sents:
    print(sent)

The sents attribute yields every sentence that spacy detects in the string.

Output

I am good.
How are you?
Thank you.

For your input:

... increased by 11%. Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged. Industry ...

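To get only the complete sentence, wrap doc.sents in list() and access it by index. For this particular input the complete sentence is the second one (index 1), e.g.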

doc = nlp("... increased 11%. Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged. Industry ...")
print(list(doc.sents)[1])

Output:

Search advertising revenue, excluding traffic acquisition costs, was relatively unchanged.
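
The fixed index above only fits this one input. For the cases in the update, where the number of complete sentences is unknown, one possible approach is to check only the first and last segments and drop them if they look cut off. The sketch below is a heuristic, not part of the original answer: the function names are made up for illustration, "starts with an uppercase letter" and "ends with . ! or ?" are assumptions about what counts as complete, and naive_split is a regex stand-in for a real segmenter so the example runs without a model. With spacy you could pass [sent.text for sent in doc.sents] to keep_complete_sentences instead.

```python
import re

# Assumption: a complete sentence ends with one of these characters.
TERMINATORS = (".", "!", "?")


def naive_split(text):
    """Stand-in segmenter: split after ., ! or ? when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def keep_complete_sentences(sentences):
    """Drop the first and/or last segment if it looks truncated (heuristic)."""
    result = list(sentences)
    # The last segment is treated as cut off unless it ends with . ! or ?
    if result and not result[-1].rstrip().endswith(TERMINATORS):
        result.pop()
    # The first segment is treated as cut off unless it starts with an
    # uppercase letter.
    if result and not result[0].lstrip()[:1].isupper():
        result.pop(0)
    return result


def trim_to_complete(text):
    """Return only the segments judged complete, joined by single spaces."""
    return " ".join(keep_complete_sentences(naive_split(text)))


if __name__ == "__main__":
    print(trim_to_complete("Hey! How are you? Good!"))
    print(trim_to_complete("ey. How are you? I am good. How about"))
```

Only the two ends are tested because a segment in the middle cannot have been cut by trimming the text. The heuristic has blind spots: a leading fragment that happens to begin with a capitalized word is kept, and a complete first sentence that begins with a digit or a quotation mark is dropped.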
