Precision and recall results look a bit strange

Posted 2024-04-19 04:09:21


I am simulating a search engine that retrieves 10 documents, of which only 5 are relevant.

# Only the imports the snippets below actually use
import numpy as np
from sklearn.metrics import precision_recall_curve
from sklearn.metrics import roc_curve

y_true = np.array([True, True, False, True, False, True, False, False, False, True])

Lowering the threshold retrieves more documents:

y_scores = np.array([1, .9, .8, .7, .6, .5, .4, .3, .2, .1])

Now get the precisions, recalls and thresholds:

precisions, recalls, thresholds1 = precision_recall_curve(y_true, y_scores)

print("\nPrecisions:")
for pr in precisions:
    print('{0:0.2f}'.format(pr), end='; ')

print("\nRecalls:")
for rec in recalls:
    print('{0:0.2f}'.format(rec), end='; ')

print("\nThresholds:")
for thr in thresholds1:
    print('{0:0.2f}'.format(thr), end='; ')

Output 1

Precisions:
0.50; 0.44; 0.50; 0.57; 0.67; 0.60; 0.75; 0.67; 1.00; 1.00; 1.00;
Recalls:
1.00; 0.80; 0.80; 0.80; 0.80; 0.60; 0.60; 0.40; 0.40; 0.20; 0.00;
Thresholds:
0.10; 0.20; 0.30; 0.40; 0.50; 0.60; 0.70; 0.80; 0.90; 1.00;

Code for Output 2:

falsePositiveRates, truePositiveRates, thresholds2 = roc_curve(y_true, y_scores, pos_label = True)

print("\nFPRs:")
for fpr in falsePositiveRates:
    print('{0:0.2f}'.format(fpr), end='; ')

print("\nTPRs:")
for tpr in truePositiveRates:
    print('{0:0.2f}'.format(tpr), end='; ')

print("\nThresholds:")
for thr in thresholds2:
    print('{0:0.2f}'.format(thr), end='; ')

Output 2

FPRs:
0.00; 0.00; 0.20; 0.20; 0.40; 0.40; 1.00; 1.00;
TPRs:
0.20; 0.40; 0.40; 0.60; 0.60; 0.80; 0.80; 1.00;
Thresholds:
1.00; 0.90; 0.80; 0.70; 0.60; 0.50; 0.20; 0.10;

Question: in Output 1, why is the last precision (which will be the first one on the plot) computed as 1 rather than 0?

In Output 2, why do FPR, TPR and thresholds have length 8 instead of 10?


1 answer

User

#1 · Posted 2024-04-19 04:09:21

In output1 why the last precision (which will be the 1st on plot) is set to 1 instead of 0?

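At the strictest threshold you select only one item, and it is relevant (a true positive), so precision is 1/1 = 1. The final entry (precision 1, recall 0) has no corresponding threshold: scikit-learn appends it so that the curve starts on the y axis, which is why there are 11 precision values but only 10 thresholds.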
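
As a rough check of the numbers in Output 1, the sketch below recomputes precision and recall by hand in plain Python. The helper name precision_recall_at is made up for illustration and is not part of scikit-learn; it assumes a document counts as retrieved when its score is greater than or equal to the threshold.

```python
# Hypothetical helper for illustration; not part of scikit-learn.
y_true = [True, True, False, True, False, True, False, False, False, True]
y_scores = [1, .9, .8, .7, .6, .5, .4, .3, .2, .1]


def precision_recall_at(threshold):
    """Precision and recall when every document scoring >= threshold is retrieved."""
    retrieved = [rel for rel, score in zip(y_true, y_scores) if score >= threshold]
    true_positives = sum(retrieved)
    precision = true_positives / len(retrieved)
    recall = true_positives / sum(y_true)
    return precision, recall


# Strictest threshold: only the top-scored document is retrieved, and it is relevant.
strictest = precision_recall_at(1.0)
# Loosest threshold: all 10 documents are retrieved, 5 of them relevant.
loosest = precision_recall_at(0.1)

print(strictest)  # (1.0, 0.2)
print(loosest)    # (0.5, 1.0)
```

These agree with the 1.00 / 0.20 and 0.50 / 1.00 pairs listed for thresholds 1.00 and 0.10 in Output 1.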

In output2 why counts of FPR, TPR, Threshold are 8 instead of 10

You left drop_intermediate at its default of True. 0.3 and 0.4 are suboptimal thresholds, so their points are dropped from the curve.
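
To see why exactly those two thresholds go, here is a minimal plain-Python sketch that builds all 10 ROC points (as raw false-positive and true-positive counts) and flags the ones lying on a straight segment between their neighbours. It mirrors the idea behind drop_intermediate rather than reproducing the library's code, and the variable names are illustrative only.

```python
# Plain-Python sketch; names are illustrative, not scikit-learn internals.
y_true = [True, True, False, True, False, True, False, False, False, True]
y_scores = [1, .9, .8, .7, .6, .5, .4, .3, .2, .1]

# One (threshold, false positives, true positives) triple per score, strictest first.
points = []
for thr in sorted(y_scores, reverse=True):
    retrieved = [rel for rel, score in zip(y_true, y_scores) if score >= thr]
    tp = sum(retrieved)
    fp = len(retrieved) - tp
    points.append((thr, fp, tp))

# An interior point is redundant when it lies on the straight segment joining
# its neighbours, i.e. the step into it equals the step out of it.
redundant = []
for prev, cur, nxt in zip(points, points[1:], points[2:]):
    step_in = (cur[1] - prev[1], cur[2] - prev[2])
    step_out = (nxt[1] - cur[1], nxt[2] - cur[2])
    if step_in == step_out:
        redundant.append(cur[0])

print(redundant)                     # [0.4, 0.3]
print(len(points) - len(redundant))  # 8
```

Between thresholds 0.5 and 0.2 only false positives are added, so the curve is a flat line and the two points in the middle carry no extra information. Passing drop_intermediate=False to roc_curve keeps them.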
