How do I get the separate probabilities of every class for each sample with predict_proba?

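2 Answers

User

Answer 1 · edited 2024-05-12 18:01:03

predict_proba computes, for each sample, the probability of belonging to each class. [0.95 0.05] means that for this particular sample about 95% of the decision trees in the model output class 0 and 5% output class 1 (strictly, scikit-learn averages the per-tree class probabilities, which matches the vote share when the trees are fully grown). So each sample is evaluated on its own.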

When you do this:

classifier.predict_proba(example_feature_set)[0]

you get the probability of each class for the first sample in example_feature_set.
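
To make the shape of that result concrete, here is a minimal sketch. The iris data, the random_state values, and the three-row example_feature_set are assumptions for illustration only, not part of the original question.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative data and model (assumed, not from the original question).
X, y = load_iris(return_X_y=True)
classifier = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

example_feature_set = X[:3]
proba = classifier.predict_proba(example_feature_set)

# One row per sample, one column per class, ordered as in classifier.classes_.
print(proba.shape)          # (3, 3)
print(classifier.classes_)  # [0 1 2]

# Probabilities of the first sample only, keyed by class label.
first_sample = dict(zip(classifier.classes_, proba[0]))
print(first_sample)

# Every row is a full distribution over the classes, so it sums to 1.
row_sums = proba.sum(axis=1)
print(np.allclose(row_sums, 1.0))
```

Indexing with [0] therefore selects one sample, not one class; use proba[:, k] to get the probability of class classifier.classes_[k] for all samples.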

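I think what you actually want is the precision or recall of each class. (If you are not familiar with them, look up what these scores mean.)

To compute them, I suggest the following code: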

from sklearn.metrics import classification_report
# Assumes example_feature_set holds more than one sample and y_test holds its true labels
y_pred = classifier.predict(example_feature_set)
print(classification_report(y_test, y_pred))

You will then get several metrics that may help you.
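
If the per-class numbers are needed as arrays rather than as a printed table, something like the following sketch may help. The iris split and the random_state values are assumptions for illustration; precision_recall_fscore_support with average=None returns one value per class.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

# Illustrative data and model (assumed, not from the original question).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)
classifier = RandomForestClassifier(n_estimators=10, random_state=0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Human-readable table with precision, recall and F1 for each class.
print(classification_report(y_test, y_pred))

# The same per-class scores as NumPy arrays (average=None: no averaging).
precision, recall, f1, support = precision_recall_fscore_support(
    y_test, y_pred, labels=classifier.classes_, average=None
)
for label, p, r in zip(classifier.classes_, precision, recall):
    print(f"class {label}: precision={p:.2f}, recall={r:.2f}")
```

Note that these scores describe how well the model does on each class across a labelled test set, which is a different quantity from the per-sample probabilities returned by predict_proba.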

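User
Answer 2 · edited 2024-05-12 18:01:03

Random forest is an ensemble method. Basically, it builds separate decision trees on different subsets of the data (this is called bagging) and averages the predictions of all the trees to give the probabilities. The documentation page is actually a good place to start: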
In averaging methods, the driving principle is to build several estimators independently and then to average their predictions. On average, the combined estimator is usually better than any of the single base estimator because its variance is reduced.
Examples: Bagging methods, Forests of randomized trees, …
Therefore, the probabilities always sum to one. Here is an example of how to access the individual prediction of each tree:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=42)

from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=10)
model.fit(X_train, y_train)
pred = model.predict_proba(X_test)
pred[:5,:]
array([[0. , 1. , 0. ],
       [1. , 0. , 0. ],
       [0. , 0. , 1. ],
       [0. , 0.9, 0.1],
       [0. , 0.9, 0.1]])
This is the prediction of the first tree:
model.estimators_[0].predict(X_test)
Out[42]:
array([1., 0., 2., 2., 1., 0., 1., 2., 2., 1.,
       2., 0., 0., 0., 0., 2., 2., 1., 1., 2.,
       0., 2., 0., 2., 2., 2., 2., 2., 0., 0.,
       0., 0., 1., 0., 0., 2., 1., 0., 0., 0.,
       2., 2., 1., 0., 0., 1., 1., 2., 1., 2.])
We tally the votes of all the trees:
import numpy as np

# One row per test sample, one column per class; add one vote per tree
result = np.zeros((len(X_test), 3))
for i in range(len(model.estimators_)):
    p = model.estimators_[i].predict(X_test).astype(int)
    result[range(len(X_test)), p] += 1
result[:5,:]
Out[63]:
array([[ 0., 10.,  0.],
       [10.,  0.,  0.],
       [ 0.,  0., 10.],
       [ 0.,  9.,  1.],
       [ 0.,  9.,  1.]])
Dividing this by the number of trees gives the probabilities you obtained earlier:
result/10
Out[65]:
array([[0. , 1. , 0. ],
       [1. , 0. , 0. ],
       [0. , 0. , 1. ],
       [0. , 0.9, 0.1],
       [0. , 0.9, 0.1],
       ...
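
As a cross-check of the averaging idea, here is a sketch that averages each tree's predict_proba output instead of counting hard votes. As far as I understand the scikit-learn documentation, this is what the forest itself does; the two approaches coincide when every leaf is pure, which is typical for fully grown trees. The random_state=0 on the model is an assumption added only to make the run repeatable.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.33, random_state=42
)
# random_state=0 is an assumption, added for reproducibility only.
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(X_train, y_train)

# Stack per-tree class probabilities: shape (n_trees, n_samples, n_classes).
per_tree = np.stack([tree.predict_proba(X_test) for tree in model.estimators_])

# Average over the tree axis to get one distribution per sample.
averaged = per_tree.mean(axis=0)

# Compare with what the forest reports directly.
forest_proba = model.predict_proba(X_test)
print(np.allclose(averaged, forest_proba))
```

If max_depth or min_samples_leaf is set so that leaves contain mixed classes, the vote-counting loop and predict_proba can differ slightly, while the average of per-tree probabilities should still agree.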
