Detailed description of the tokenizesentences Python project
Overview
A Python 3 module for tokenizing English text into sentences, based on D Greenberg's answer on Stack Overflow: https://stackoverflow.com/questions/4576077/python-split-text-on-sentences
Installation
Using pip:
pip3 install -U tokenizesentences
Usage
In [1]: import tokenizesentences
In [2]: m = tokenizesentences.SplitIntoSentences()
In [3]: m.split_into_sentences(
"Mr. John Johnson Jr. was born in the U.S.A but earned his Ph.D. in Israel before joining Nike Inc. as an engineer. He also worked at craigslist.org as a business analyst."
)
Out[3]:
[
'Mr. John Johnson Jr. was born in the U.S.A but earned his Ph.D. in Israel before joining Nike Inc. as an engineer.',
'He also worked at craigslist.org as a business analyst.'
]
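The linked Stack Overflow answer works by protecting the periods inside abbreviations (Mr., Jr., Ph.D., Inc., and so on) before splitting on sentence-ending punctuation. The sketch below illustrates that idea in a heavily simplified form; the function name, the abbreviation list, and the `<prd>` placeholder are illustrative assumptions, not the library's actual implementation.

```python
import re

# Illustrative, incomplete list of abbreviations whose periods should
# not be treated as sentence boundaries (an assumption for this sketch;
# the real module handles many more cases).
ABBREVIATIONS = ["Mr.", "Mrs.", "Dr.", "Jr.", "Inc.", "Ph.D.", "U.S.A"]

def split_sentences(text):
    # Replace the period in each known abbreviation with a placeholder
    # token so the splitter ignores it.
    protected = text
    for abbr in ABBREVIATIONS:
        protected = protected.replace(abbr, abbr.replace(".", "<prd>"))
    # Split after ".", "!" or "?" when followed by whitespace; periods
    # inside tokens like "craigslist.org" are untouched because no
    # whitespace follows them.
    parts = re.split(r"(?<=[.!?])\s+", protected)
    # Restore the protected periods.
    return [p.replace("<prd>", ".") for p in parts]
```

On the example sentence above, this sketch yields the same two-sentence split as the library, but a placeholder-based approach like this breaks down on abbreviations it does not know about, which is why the module's larger rule set exists.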