What is the best way to train your model to favor recall/precision?

Posted 2024-04-26 05:25:50


I have a binary classification problem, and my dataset consists of 5% positive labels. I am training my model with TensorFlow. Here are my results during training:

Step 3819999: loss = 0.22 (0.004 sec)
Accuracy = 0.955; Recall = 0.011; Precision = 0.496

Step 3820999: loss = 0.21 (0.003 sec)
Accuracy = 0.955; Recall = 0.011; Precision = 0.496

Step 3821999: loss = 0.15 (0.003 sec)
Accuracy = 0.955; Recall = 0.011; Precision = 0.496

Step 3822999: loss = 0.15 (0.003 sec)
Accuracy = 0.955; Recall = 0.011; Precision = 0.496

What are the main strategies for improving recall? Changing the dataset by adding more positive labels might solve the problem, but it seems strange to alter the reality of the problem...

It seems to me there should be a way to favor true positives over false negatives, but I can't find one.


Tags: data, model, binary classification, labels, strategy
1 Answer
Answer #1 · Posted 2024-04-26 05:25:50

You should use weighted cross-entropy instead of the classic CE. From the TensorFlow documentation:

This is like sigmoid_cross_entropy_with_logits() except that pos_weight allows one to trade off recall and precision by up- or down-weighting the cost of a positive error relative to a negative error. The usual cross-entropy cost is defined as:

targets * -log(sigmoid(logits)) + (1 - targets) * -log(1 - sigmoid(logits))

A value pos_weight > 1 decreases the false negative count, hence increasing the recall. Conversely, setting pos_weight < 1 decreases the false positive count and increases the precision. This can be seen from the fact that pos_weight is introduced as a multiplicative coefficient for the positive targets term in the loss expression:

targets * -log(sigmoid(logits)) * pos_weight + (1 - targets) * -log(1 - sigmoid(logits))
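As a rough sketch of how this looks in code (TF 2.x): a common starting point for pos_weight is the negative/positive ratio, which for the 5% positive labels in the question works out to 0.95/0.05 = 19; that value is an assumption derived from the question and should be tuned on a validation set.

import tensorflow as tf

# With ~5% positive labels, a natural starting point is
# pos_weight = (# negatives) / (# positives) = 0.95 / 0.05 = 19.
# This is an assumption based on the question's class ratio; tune it
# to reach the recall/precision trade-off you want.
pos_weight = 19.0

def weighted_bce_loss(labels, logits):
    # tf.nn.weighted_cross_entropy_with_logits multiplies the positive
    # term of the cross-entropy by pos_weight, exactly as in the
    # formula quoted above. It expects raw logits, not sigmoid outputs.
    loss = tf.nn.weighted_cross_entropy_with_logits(
        labels=labels, logits=logits, pos_weight=pos_weight)
    return tf.reduce_mean(loss)

# Minimal usage example with dummy data:
labels = tf.constant([[1.0], [0.0], [0.0]])
logits = tf.constant([[0.3], [-1.2], [0.8]])
print(weighted_bce_loss(labels, logits))

Increasing pos_weight pushes recall up at the cost of precision, so in practice you would sweep a few values (e.g. 5, 10, 19, 30) and pick whichever gives an acceptable balance on a held-out set.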
