How do I output the regression prediction from each tree in a random forest in Python scikit-learn?

3 Answers

User

Answer 1 · edited 2024-04-28 06:34:16

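I had the same problem, and I don't know how you got the right answer using print(clf.estimators_[tree].predict(val.irow(1))). It gave me what looked like random numbers instead of the actual classes. After reading the sklearn source code, I realized that we actually have to use predict_proba() instead of predict() here, and that it gives the class predicted by the tree in the order given by clf.classes_. For example: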

tree_num = 2
tree_pred = clf.estimators_[tree_num].predict_proba(data_test)
print(clf.classes_)  # gives you the order of the classes
print(tree_pred)     # gives you an array of 0s with the predicted class set to 1
# Example output:
# ['class1', 'class2', 'class3']
# [0, 1, 0]
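The "random numbers" from predict() on a single tree are most likely encoded class indices: as far as I can tell, the forest encodes the labels before fitting, so each sub-tree only ever sees positions into clf.classes_. Here is a minimal sketch of mapping them back. The synthetic dataset, the label names, and the variable names are assumptions for illustration, not taken from the original question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative data with string labels (assumption, not from the question)
X, y_int = make_classification(n_samples=200, n_features=8, n_informative=4,
                               n_classes=3, random_state=0)
labels = np.array(["class1", "class2", "class3"])
y = labels[y_int]

clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
data_test = X[:5]

tree = clf.estimators_[2]

# A sub-tree's predict() returns floats that index into clf.classes_
raw = tree.predict(data_test)
tree_labels = clf.classes_[raw.astype(int)]

# Same result via predict_proba(): take the most probable column per row
proba_labels = clf.classes_[np.argmax(tree.predict_proba(data_test), axis=1)]

print(raw)
print(tree_labels)
```

Either route should give the same labels; the predict_proba() route just makes the class order explicit.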

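You can also call clf.predict_proba() on the data. It gives you the predicted probability of each class, accumulated over all the trees, and saves you the pain of going through every tree yourself: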

x = clf.predict_proba(data_test)  # assume data_test has two instances
print(clf.classes_)
print(x)
# Example output:
# ['class1', 'class2', 'class3']
# [[0.12, 0.02, 0.86],   # probabilities for the first instance
#  [0.35, 0.01, 0.64]]   # for the second instance
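To make the "accumulated over the trees" part concrete, here is a small sketch that stacks the per-tree predict_proba() outputs and checks that their mean matches the forest's own predict_proba(). The synthetic data and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative data (assumption, not from the question)
X, y = make_classification(n_samples=200, n_features=8, n_informative=4,
                           n_classes=3, random_state=0)
clf = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)
data_test = X[:2]  # two instances

# Shape: (n_trees, n_samples, n_classes)
per_tree = np.stack([t.predict_proba(data_test) for t in clf.estimators_])

# The forest's probabilities are the average over its trees
averaged = per_tree.mean(axis=0)
forest_proba = clf.predict_proba(data_test)

print(per_tree.shape)
print(np.allclose(averaged, forest_proba))
```

Keeping the stacked array around is handy if you want to see how much the trees disagree, not just their average.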

User

Answer 2 · edited 2024-04-28 06:34:16

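What I did recently was modify the sklearn source code to get this. Inside the sklearn package, in the code behind sklearn.ensemble.RandomForestRegressor, there is a function where, if you add a print, you will see the individual result of each tree. You can change it to return those values instead and get each tree's result separately: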

def _accumulate_prediction(predict, X, out, lock):
    """
    This is a utility function for joblib's Parallel.

    It can't go locally in ForestClassifier or ForestRegressor, because joblib
    complains that it cannot pickle it when placed there.
    """
    prediction = predict(X, check_input=False)
    print(prediction)
    with lock:
        if len(out) == 1:
            out[0] += prediction
        else:
            for i in range(len(out)):
                out[i] += prediction[i]

This is a bit complicated, because you have to modify the sklearn source code.
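For what it's worth, the same per-tree values can probably be had without touching the source, by looping over the fitted forest's estimators_ attribute. A minimal sketch, assuming a single-output RandomForestRegressor; the synthetic data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Illustrative data (assumption, not from the question)
X, y = make_regression(n_samples=200, n_features=6, noise=0.5, random_state=0)
reg = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
X_new = X[:4]

# One row per tree, one column per sample: shape (n_trees, n_samples)
per_tree = np.stack([tree.predict(X_new) for tree in reg.estimators_])

# The forest's prediction is the mean of the per-tree predictions
forest_pred = reg.predict(X_new)
print(np.allclose(per_tree.mean(axis=0), forest_pred))

# Per-sample spread across trees, a rough indication of disagreement
spread = per_tree.std(axis=0)
print(spread)
```

This keeps the installed package untouched, so it survives sklearn upgrades.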

User

Answer 3 · edited 2024-04-28 06:34:16

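I'm fairly sure that what you have is the best you can do. As you noted, predict() returns the prediction of the whole RF, but not the predictions of its component trees. It can return a matrix, but only when several targets are learned at once. In that case it returns one prediction per target, not one per tree. In R's randomForest you can get the individual tree predictions with predict.all = TRUE, but sklearn has no equivalent. If you try apply(), you get a matrix of leaf indices, and you would still have to iterate over the trees to find out what the prediction is for each tree/leaf combination. So I think what you have is as good as it gets.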
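To illustrate the apply() route mentioned above: the sketch below takes the leaf-index matrix and looks up each leaf's stored value in the corresponding tree, then compares that with simply calling predict() on each tree. It assumes a single-output RandomForestRegressor and relies on the low-level tree_.value array, whose layout I am taking to be (n_nodes, n_outputs, 1) for regressors; the data and names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Illustrative data (assumption, not from the question)
X, y = make_regression(n_samples=200, n_features=6, noise=0.5, random_state=0)
reg = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)
X_new = X[:4]

# Leaf index reached by each sample in each tree: shape (n_samples, n_trees)
leaves = reg.apply(X_new)

# Look up the value stored at each leaf, tree by tree
from_leaves = np.column_stack([
    tree.tree_.value[leaves[:, i], 0, 0]
    for i, tree in enumerate(reg.estimators_)
])

# The direct route: ask each tree for its prediction
direct = np.column_stack([tree.predict(X_new) for tree in reg.estimators_])

print(np.allclose(from_leaves, direct))
```

Since both routes need a loop over the trees anyway, calling predict() on each tree is the simpler of the two.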
