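Creating a threshold-coded ROC plot in Python
R's ROCR package provides ROC plotting options that color-code and label the threshold values along the curve:
The closest I can get in Python is something like this: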
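import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# qualityTrain is the asker's DataFrame: true labels and predicted scores
fpr, tpr, thresholds = roc_curve(qualityTrain.PoorCare, qualityTrain.Pred1)
plt.plot(fpr, tpr, label='ROC curve', color='b')
plt.gca().set_aspect('equal')  # gca() reuses the current axes instead of creating new ones
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])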
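This produces:
Is there a package that offers the equivalent of R's ability to label thresholds (with print.cutoffs.at) and color-code them (with colorize)? I assume this information is in thresholds, returned by sklearn.metrics.roc_curve, but I can't figure out how to use it to color and label the plot.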
2 Answers
0
import sklearn.metrics  # for roc_curve and auc
import matplotlib.pyplot as plt


def plot_roc(labels, predictions, positive_label, thresholds_every=10, title=''):
    # fp: false positive rates. tp: true positive rates
    fp, tp, thresholds = sklearn.metrics.roc_curve(labels, predictions, pos_label=positive_label)
    roc_auc = sklearn.metrics.auc(fp, tp)

    plt.figure(figsize=(16, 16))
    plt.plot(fp, tp, label='ROC curve (area = %0.2f)' % roc_auc, linewidth=2, color='darkorange')
    plt.plot([0, 1], [0, 1], color='navy', linestyle='--', linewidth=2)
    plt.xlabel('False positive rate')
    plt.ylabel('True positive rate')
    plt.xlim([-0.03, 1.0])
    plt.ylim([0.0, 1.03])
    plt.title(title)
    plt.legend(loc="lower right")
    plt.grid(True)

    # Label every `thresholds_every`-th threshold, colored by its index
    thresholds_length = len(thresholds)
    color_map = plt.get_cmap('jet', thresholds_length)
    for i in range(0, thresholds_length, thresholds_every):
        # Truncate the label to at most five characters, e.g. 0.123
        threshold_text = str(thresholds[i])[:5]
        plt.text(fp[i] - 0.03, tp[i] + 0.005, threshold_text,
                 fontdict={'size': 15}, color=color_map(i / thresholds_length))
    plt.show()
Usage:
labels = [1, 1, 2, 2, 2, 3]
predictions = [0.7, 0.99, 0.9, 0.3, 0.7, 0.01]  # predicted scores for class 1
plot_roc(labels, predictions, positive_label=1, thresholds_every=1, title="ROC Curve - Class 1")
Result: plot output
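The function above colors only the text labels. To get closer to ROCR's colorize option, where the curve itself changes color with the threshold, one possible approach is to draw the curve as a matplotlib LineCollection. The sketch below is only an illustration: the name plot_colorized_roc and its parameters are made up for this example, and clipping the first threshold assumes that roc_curve puts a sentinel value (inf in recent scikit-learn versions) at the start of thresholds.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection
from sklearn.metrics import roc_curve


def plot_colorized_roc(y_true, y_score, pos_label=None, cmap="jet"):
    """Draw a ROC curve whose segments are colored by threshold.

    Hypothetical helper for illustration, not part of any library.
    """
    fpr, tpr, thresholds = roc_curve(y_true, y_score, pos_label=pos_label)
    # Assumption: the first threshold is a sentinel above every score,
    # so clip it to the largest real score to keep the color scale finite.
    thresholds = np.clip(thresholds, None, np.max(y_score))

    # Build one line segment per pair of consecutive ROC points.
    points = np.column_stack([fpr, tpr]).reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)

    fig, ax = plt.subplots(figsize=(6, 6))
    lc = LineCollection(segments, cmap=cmap)
    lc.set_array(thresholds[:-1])  # color each segment by its starting threshold
    lc.set_linewidth(2)
    ax.add_collection(lc)
    fig.colorbar(lc, ax=ax, label="Threshold")

    ax.plot([0, 1], [0, 1], color="navy", linestyle="--", linewidth=1)
    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(-0.05, 1.05)
    ax.set_aspect("equal")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    return fig, ax, lc
```

With the data from the usage example, this could be called as plot_colorized_roc(labels, predictions, pos_label=1), followed by plt.show(). The colorbar plays the role of the color scale that ROCR draws next to the curve.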
9
Take a look at this link:
https://gist.github.com/podshumok/c1d1c9394335d86255b8
roc_data = sklearn.metrics.roc_curve(...)
plot_roc(*roc_data, label_every=5)
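The gist's code is not reproduced here, so in case the link is unavailable, here is a rough sketch of a function with the same call shape. It is not the gist's implementation: the body, the ax parameter, and the choice to color the points by threshold are assumptions made for illustration. It also assumes that thresholds may start with a non-finite sentinel (inf in recent scikit-learn versions), which is skipped.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_roc(fpr, tpr, thresholds, label_every=None, ax=None):
    """Plot a ROC curve and annotate every `label_every`-th threshold.

    Hypothetical sketch; not the code from the linked gist.
    """
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 6))

    ax.plot(fpr, tpr, color="b", label="ROC curve")

    # Color the operating points by threshold, skipping non-finite values
    # such as an inf sentinel in the first position.
    finite = np.isfinite(thresholds)
    sc = ax.scatter(fpr[finite], tpr[finite], c=thresholds[finite],
                    cmap="jet", zorder=3)
    ax.figure.colorbar(sc, ax=ax, label="Threshold")

    # Write the threshold value next to every `label_every`-th point.
    if label_every:
        for i in range(0, len(thresholds), label_every):
            if finite[i]:
                ax.annotate(f"{thresholds[i]:.2f}", (fpr[i], tpr[i]),
                            textcoords="offset points", xytext=(5, -10))

    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(-0.05, 1.05)
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend(loc="lower right")
    return ax
```

Because roc_curve returns (fpr, tpr, thresholds) in that order, the tuple can be unpacked straight into the function, as in the call above. Call plt.show() afterwards to display the figure.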