围绕特定单词的句子选择

2条回答

网友

1楼 · 编辑于 2024-05-15 13:33:17

你可以定义一个类似这样的函数

def find_sentences( word, text ):
    sentences = text.split('.')
    findings = []
    for i in range(len(sentences)):
        if word.lower() in sentences[i].lower():
            if i==0:
                findings.append( sentences[i+1]+'.' )
            elif i==len(sentences)-1:
                findings.append( sentences[i-1]+'.' )
            else:
                findings.append( sentences[i-1]+'.' + sentences[i+1]+'.' )
    return findings

这可以称为

findings = find_sentences( 'Power curve', Str_wrds )

用一些漂亮的印刷品

for finding in findings:
print( finding +'\n')

我们得到了结果

However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height.

Power curve, supplied by turbine manufacturers, are extensively used in condition monitoring, energy estimation, and improving operational efficiency. Data-driven model accuracy is significantly affected by uncertainty.

The uncertainty associated with models is quantified using confidence intervals (CIs), which are themselves estimated. A radial basis function is taken as the kernel function to improve the accuracy of the SVM models.

The proposed techniques are then verified by extensive 10 min average supervisory control and data acquisition (SCADA) data, obtained from pitch-controlled wind turbines..

我希望这就是你想要的：）

网友

2楼 · 编辑于 2024-05-15 13:33:17

当你说

I would like to select before and after 1 sentence whenever Test_wrds found it in a paragraph and list them as a separate string.

我猜你的意思是，所有句子中有一个单词在Test_wrds中，在它们前面的句子，在它们后面的句子，也应该被选择

作用

def remove_sentence(Str_wrds: str, Test_wrds):
    # store all selected sentences
    all_selected_sentences = {}
    # initialize empty dictionary
    for k in Test_wrds:
        # one element for each occurrence
        all_selected_sentences[k] = [''] * Str_wrds.lower().count(k.lower())

    # list of sentences
    sentences = Str_wrds.split(".")

    word_counter = {}.fromkeys(Test_wrds,0)

    for i, sentence in enumerate(sentences):
        for j, word in enumerate(Test_wrds):
            # case insensitive
            if word.lower() in sentence.lower():
                if i == 0:  # first sentence
                    chosen_sentences = sentences[0:2]
                elif i == len(sentences) - 1:  # last sentence
                    chosen_sentences = sentences[-2:]
                else:
                    chosen_sentences = sentences[i - 1:i + 2]

                # get which occurrence of the word is it
                k = word_counter[word]

                all_selected_sentences[word][k] += '.'.join(
                    [s for s in chosen_sentences
                        if s not in all_selected_sentences[word][k]]) + "."

                word_counter[word] += 1  # increment the word counter

    return all_selected_sentences

运行这个

answer = remove_sentence(Str_wrds, Test_wrds)
print(answer)

使用为Str_wrds和Test_wrds提供的值，返回此输出

{
    'Power curve': [
        'Power curve, supplied by turbine manufacturers, are extensively used in condition monitoring, energy estimation, and improving operational efficiency. However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height.',
        'Power curve, supplied by turbine manufacturers, are extensively used in condition monitoring, energy estimation, and improving operational efficiency. However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height. Data-driven model accuracy is significantly affected by uncertainty.',
        ' The uncertainty associated with models is quantified using confidence intervals (CIs), which are themselves estimated. This study proposes two approaches, namely, pointwise CIs and simultaneous CIs, to measure the uncertainty associated with an SVM-based power curve model. A radial basis function is taken as the kernel function to improve the accuracy of the SVM models.',
        ' The proposed techniques are then verified by extensive 10 min average supervisory control and data acquisition (SCADA) data, obtained from pitch-controlled wind turbines. The results suggest that both proposed techniques are effective in measuring SVM power curve uncertainty, out of which, pointwise CIs are found to be the most accurate because they produce relatively smaller CIs.'
    ],
    'data-driven': [
        ' However, there is substantial uncertainty linked to power curve measurements as they usually take place only at hub height. Data-driven model accuracy is significantly affected by uncertainty. Therefore, an accurate estimation of uncertainty gives the confidence to wind farm operators for improving performance/condition monitoring and energy forecasting activities that are based on data-driven methods.',
        ' Data-driven model accuracy is significantly affected by uncertainty. Therefore, an accurate estimation of uncertainty gives the confidence to wind farm operators for improving performance/condition monitoring and energy forecasting activities that are based on data-driven methods. The support vector machine (SVM) is a data-driven, machine learning approach, widely used in solving problems related to classification and regression.',
        ' Therefore, an accurate estimation of uncertainty gives the confidence to wind farm operators for improving performance/condition monitoring and energy forecasting activities that are based on data-driven methods. The support vector machine (SVM) is a data-driven, machine learning approach, widely used in solving problems related to classification and regression. The uncertainty associated with models is quantified using confidence intervals (CIs), which are themselves estimated.'
    ],
    'wind turbines': [
        ' A radial basis function is taken as the kernel function to improve the accuracy of the SVM models. The proposed techniques are then verified by extensive 10 min average supervisory control and data acquisition (SCADA) data, obtained from pitch-controlled wind turbines. The results suggest that both proposed techniques are effective in measuring SVM power curve uncertainty, out of which, pointwise CIs are found to be the most accurate because they produce relatively smaller CIs.'
    ]
}

注:

函数返回列表
每个关键字都是Test_wrds中的一个单词，列表元素是该单词的一个匹配项
例如，由于单词“功率曲线”在整个文本中出现4次，因此输出中“功率曲线”的值是4个元素的列表

作用

注:

相关问题更多 >

编程相关推荐

热门问题

热门文章

围绕特定单词的句子选择

作用

注:

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >