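I am using scikit-learn's RandomForestClassifier to classify imbalanced data. The target is either "1" or "0" (99% of the values are 0).
I would like to assign a weight to the classes. How can I do that?
I found this in the documentation: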
sample_weight : array-like, shape = [n_samples] or None
Sample weights. If None, then samples are equally weighted. Splits that would create child nodes with net zero or negative weight are ignored while searching for a split in each node. In the case of classification, splits are also ignored if they would result in any single class carrying a negative weight in either child node.
I need to increase the influence of the "1" class.
Should I do it like this:
s_weights = np.array([100 if i == 1 else 1 for i in y_train])
or like this:
^{pr2}$
clf.fit(X_train, y_train, sample_weight=s_weights)
I am not getting the results I expected, so can anyone confirm which one is right?
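Technically, giving the "1" samples a larger weight through sample_weight is correct, although weighting in a random forest is not as straightforward as in, say, a support vector machine. You will have to cross-validate to find the best weight (which is probably much smaller than 100).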
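To make the cross-validation advice concrete, here is a minimal sketch of how the weight for class 1 could be searched. It is an illustration only: the synthetic data from make_classification, the list of candidate weights, and the average-precision scoring are all assumptions, not part of the question. It also uses the class_weight constructor parameter instead of passing sample_weight to fit, because that is easier to put into a grid search; a dict such as {0: 1, 1: w} should have a comparable effect to the per-sample array in the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic stand-in for the asker's data (assumption): about 95% zeros, 5% ones.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.95, 0.05], random_state=0
)

# Hypothetical candidate weights for class 1; class 0 always keeps weight 1.
candidate_weights = [1, 2, 5, 10, 50, 100]
param_grid = {"class_weight": [{0: 1, 1: w} for w in candidate_weights]}

search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid,
    # Accuracy is misleading on imbalanced data, so score on average precision.
    scoring="average_precision",
    # Stratified folds keep some "1" samples in every split.
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X, y)

best_weight = search.best_params_["class_weight"][1]
print("best weight for class 1:", best_weight)
print("cross-validated average precision:", search.best_score_)
```

Which weight wins depends on the data and on the metric, so the scoring argument should be swapped for whatever measure actually matters for the problem (recall on class 1, F1, and so on).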