Need to capture the output of a function that has no return value

3 Answers

User

Answer 1 · edited 2024-05-23 15:57:02

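TL;DR

The demo_liu_hu_lexicon function is a demonstration of how to use opinion_lexicon. It is meant for testing and should not be used directly.

In Long

Let's look at the function and see how to recreate a similar one: https://github.com/nltk/nltk/blob/develop/nltk/sentiment/util.py#L616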

def demo_liu_hu_lexicon(sentence, plot=False):
    """
    Basic example of sentiment classification using Liu and Hu opinion lexicon.
    This function simply counts the number of positive, negative and neutral words
    in the sentence and classifies it depending on which polarity is more represented.
    Words that do not appear in the lexicon are considered as neutral.
    :param sentence: a sentence whose polarity has to be classified.
    :param plot: if True, plot a visual representation of the sentence polarity.
    """
    from nltk.corpus import opinion_lexicon
    from nltk.tokenize import treebank

    tokenizer = treebank.TreebankWordTokenizer()

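OK, having the imports inside the function is an odd usage, but that is because it is a demo function meant for simple testing or documentation.

Also, the use of treebank.TreebankWordTokenizer() is rather odd; we could simply use nltk.word_tokenize.

Let's move the imports out of the function and rewrite demo_liu_hu_lexicon as a simple_sentiment function.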


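Next, we see that the function:

1. First tokenizes and lowercases the sentence.
2. Initializes the counts of positive and negative words.
3. Initializes x and y for some plotting later, so we can ignore them.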

If we go further down the function:

def demo_liu_hu_lexicon(sentence, plot=False):
    from nltk.corpus import opinion_lexicon
    from nltk.tokenize import treebank

    tokenizer = treebank.TreebankWordTokenizer()
    pos_words = 0
    neg_words = 0
    tokenized_sent = [word.lower() for word in tokenizer.tokenize(sentence)]

    x = list(range(len(tokenized_sent))) # x axis for the plot
    y = []

    for word in tokenized_sent:
        if word in opinion_lexicon.positive():
            pos_words += 1
            y.append(1) # positive
        elif word in opinion_lexicon.negative():
            neg_words += 1
            y.append(-1) # negative
        else:
            y.append(0) # neutral

    if pos_words > neg_words:
        print('Positive')
    elif pos_words < neg_words:
        print('Negative')
    elif pos_words == neg_words:
        print('Neutral')

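4. The loop simply goes through each token and checks whether the word is in the positive or negative lexicon.
5. At the end, it compares the positive and negative word counts and prints the tag (it does not return it).

Now that we know what demo_liu_hu_lexicon does, let's see whether we can write a better simple_sentiment function.

The tokenization in step 1 cannot be avoided, so we have: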

from nltk.corpus import opinion_lexicon
from nltk import word_tokenize

def simple_sentiment(text):
    tokens = [word.lower() for word in word_tokenize(text)]

For steps 2-5, the lazy way is to copy and paste the code and change print() -> return:

from nltk.corpus import opinion_lexicon
from nltk import word_tokenize

def simple_sentiment(text):
    tokens = [word.lower() for word in word_tokenize(text)]
    # Counters must be initialized before the loop; the x/y plotting
    # lists from the demo are dropped because nothing is plotted here.
    pos_words = 0
    neg_words = 0

    for word in tokens:
        if word in opinion_lexicon.positive():
            pos_words += 1
        elif word in opinion_lexicon.negative():
            neg_words += 1

    if pos_words > neg_words:
        return 'Positive'
    elif pos_words < neg_words:
        return 'Negative'
    else:
        return 'Neutral'

Now you have a function whose result you can do whatever you want with.
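One thing worth noting is that the demo's loop calls opinion_lexicon.positive() and opinion_lexicon.negative() again for every token, and each membership test scans a word list. A minimal sketch of a variant that loads both lists once into sets follows. The names load_liu_hu_sets and classify_tokens are hypothetical, not part of NLTK, and the import is kept inside the loader only so that the counting logic can be tried without the corpus installed:

```python
def load_liu_hu_sets():
    """Load the Liu and Hu lexicon once as two sets.

    Assumes the NLTK 'opinion_lexicon' corpus has been downloaded.
    """
    from nltk.corpus import opinion_lexicon
    return set(opinion_lexicon.positive()), set(opinion_lexicon.negative())


def classify_tokens(tokens, positive, negative):
    """Count lexicon hits in already-lowercased tokens and return a label."""
    pos_words = 0
    neg_words = 0
    for tok in tokens:
        if tok in positive:        # set lookup instead of a list scan
            pos_words += 1
        elif tok in negative:
            neg_words += 1
    if pos_words > neg_words:
        return 'Positive'
    if pos_words < neg_words:
        return 'Negative'
    return 'Neutral'


# Toy lexicon, made up for illustration, so the example runs on its own.
toy_positive = {'good', 'great'}
toy_negative = {'bad', 'awful'}
label = classify_tokens('a great but awful and bad film'.split(),
                        toy_positive, toy_negative)
print(label)  # Negative (one positive hit, two negative hits)
```

With the real lexicon, this would presumably be used as positive, negative = load_liu_hu_sets(), followed by classify_tokens([w.lower() for w in word_tokenize(text)], positive, negative).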

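BTW, the demo is really odd.

When it sees a positive word it appends 1 to y, when it sees a negative word it appends -1, and it calls the sentence positive when pos_words > neg_words.

Note that pos_words and neg_words are integer counters, so that test is an ordinary integer comparison; the list y of 1/-1/0 values is only used for plotting. Comparing the lists themselves would instead follow Python's sequence comparison, which may have no linguistic or mathematical logic (see What happens when we compare list of integers?)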
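The difference between comparing integer counters and comparing lists of integers can be seen in a few lines of plain Python. This is only an illustration with made-up values, not code from NLTK:

```python
# Integer counters, as in demo_liu_hu_lexicon: compared by value.
pos_words, neg_words = 1, 2
counts_say_positive = pos_words > neg_words   # 1 > 2

# Lists are compared lexicographically, element by element: the first
# differing pair decides, regardless of length or of the overall sum.
lists_say_greater = [1, 0, 0] > [0, 1, 1, 1]  # decided by 1 > 0

# A polarity list such as y is better reduced to a number first.
y = [1, -1, -1, 0]
net_polarity = sum(y)

print(counts_say_positive, lists_say_greater, net_polarity)
```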

User

Answer 2 · edited 2024-05-23 15:57:02

import sys
from io import StringIO

class capt_stdout:
    def __init__(self):
        self._stdout = None
        self._string_io = None

    def __enter__(self):
        self._stdout = sys.stdout
        sys.stdout = self._string_io = StringIO()
        return self

    def __exit__(self, type, value, traceback):
        sys.stdout = self._stdout

    @property
    def string(self):
        return self._string_io.getvalue()

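Use it as a context manager: wrap the call to demo_liu_hu_lexicon in a with block, then read the captured text from the string property.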
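The standard library offers comparable behaviour through contextlib.redirect_stdout, which swaps sys.stdout for the duration of a with block. A minimal sketch follows; noisy_classifier is a hypothetical stand-in for demo_liu_hu_lexicon so that the example runs without NLTK:

```python
from contextlib import redirect_stdout
from io import StringIO


def noisy_classifier(sentence):
    """Stand-in that, like the demo, prints its result and returns None."""
    print('Positive' if 'good' in sentence.lower() else 'Neutral')


buffer = StringIO()
with redirect_stdout(buffer):
    # Everything printed inside this block goes into the buffer.
    returned = noisy_classifier('A good day')

# print() appends a newline, so strip it from the captured text.
captured = buffer.getvalue().strip()
```

After the block, sys.stdout is back to its previous value and captured holds the printed label as an ordinary string.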

User

Answer 3 · edited 2024-05-23 15:57:02

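import sys
from io import StringIO

stdout_ = sys.stdout          # remember the real stdout
stream = StringIO()
sys.stdout = stream           # redirect print() output into the buffer
demo_liu_hu_lexicon('PLACE YOUR TEXT HERE')
sys.stdout = stdout_          # restore the real stdout
sentiment = stream.getvalue()
sentiment = sentiment[:-1]    # strip the trailing newline added by print()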
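One caveat with swapping sys.stdout by hand is that an exception inside the call would leave it redirected. A sketch of the same idea wrapped in a helper with try/finally follows; capture_print and shout are hypothetical names used only for illustration:

```python
import sys
from io import StringIO


def capture_print(func, *args, **kwargs):
    """Call func and return whatever it printed, minus trailing newlines."""
    original_stdout = sys.stdout
    stream = StringIO()
    sys.stdout = stream
    try:
        func(*args, **kwargs)
    finally:
        # Restore stdout even if func raised.
        sys.stdout = original_stdout
    return stream.getvalue().rstrip('\n')


def shout(text):
    """Stand-in for a print-only function such as demo_liu_hu_lexicon."""
    print(text.upper())


sentiment = capture_print(shout, 'neutral')
```

With NLTK available, the call would presumably become capture_print(demo_liu_hu_lexicon, 'PLACE YOUR TEXT HERE').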
