How do I compute precision, recall, accuracy and f1-score for the multiclass case with scikit learn?


I'm working on a sentiment analysis problem. The data looks like this:

<pre><code>label  instances
5      1190
4      838
3      239
1      204
2      127
</code></pre>

So my data is unbalanced, since 1190 <code>instances</code> are labeled <code>5</code>. I'm doing the classification with scikit's <a href="http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html" rel="noreferrer">SVC</a>. The problem is that I don't know how to balance my data the right way so that I can accurately compute precision, recall, accuracy and f1-score for the multiclass case. So I tried the following approaches.

First:

<pre><code>wclf = SVC(kernel='linear', C=1, class_weight={1: 10})
wclf.fit(X, y)
weighted_prediction = wclf.predict(X_test)

print 'Accuracy:', accuracy_score(y_test, weighted_prediction)
print 'F1 score:', f1_score(y_test, weighted_prediction, average='weighted')
print 'Recall:', recall_score(y_test, weighted_prediction, average='weighted')
print 'Precision:', precision_score(y_test, weighted_prediction, average='weighted')
print '\n classification report:\n', classification_report(y_test, weighted_prediction)
print '\n confusion matrix:\n', confusion_matrix(y_test, weighted_prediction)
</code></pre>

Second:

<pre><code>auto_wclf = SVC(kernel='linear', C=1, class_weight='auto')
auto_wclf.fit(X, y)
auto_weighted_prediction = auto_wclf.predict(X_test)

print 'Accuracy:', accuracy_score(y_test, auto_weighted_prediction)
print 'F1 score:', f1_score(y_test, auto_weighted_prediction, average='weighted')
print 'Recall:', recall_score(y_test, auto_weighted_prediction, average='weighted')
print 'Precision:', precision_score(y_test, auto_weighted_prediction, average='weighted')
print '\n classification report:\n', classification_report(y_test, auto_weighted_prediction)
print '\n confusion matrix:\n', confusion_matrix(y_test, auto_weighted_prediction)
</code></pre>

Third:

<pre><code>clf = SVC(kernel='linear', C=1)
clf.fit(X, y)
prediction = clf.predict(X_test)

from sklearn.metrics import precision_score, \
    recall_score, confusion_matrix, classification_report, \
    accuracy_score, f1_score

print 'Accuracy:', accuracy_score(y_test, prediction)
print 'F1 score:', f1_score(y_test, prediction)
print 'Recall:', recall_score(y_test, prediction)
print 'Precision:', precision_score(y_test, prediction)
print '\n classification report:\n', classification_report(y_test, prediction)
print '\n confusion matrix:\n', confusion_matrix(y_test, prediction)

F1 score:/usr/local/lib/python2.7/site-packages/sklearn/metrics/classification.py:676: DeprecationWarning: The default `weighted` averaging is deprecated, and from version 0.18, use of precision, recall or F-score with multiclass or multilabel data or pos_label=None will result in an exception. Please set an explicit value for `average`, one of (None, 'micro', 'macro', 'weighted', 'samples'). In cross validation use, for instance, scoring="f1_weighted" instead of scoring="f1".
  sample_weight=sample_weight)
/usr/local/lib/python2.7/site-packages/sklearn/metrics/classification.py:1172: DeprecationWarning: The default `weighted` averaging is deprecated, and from version 0.18, use of precision, recall or F-score with multiclass or multilabel data or pos_label=None will result in an exception. Please set an explicit value for `average`, one of (None, 'micro', 'macro', 'weighted', 'samples'). In cross validation use, for instance, scoring="f1_weighted" instead of scoring="f1".
  sample_weight=sample_weight)
/usr/local/lib/python2.7/site-packages/sklearn/metrics/classification.py:1082: DeprecationWarning: The default `weighted` averaging is deprecated, and from version 0.18, use of precision, recall or F-score with multiclass or multilabel data or pos_label=None will result in an exception. Please set an explicit value for `average`, one of (None, 'micro', 'macro', 'weighted', 'samples'). In cross validation use, for instance, scoring="f1_weighted" instead of scoring="f1".
  sample_weight=sample_weight)
0.930416613529
</code></pre>

However, I'm getting warnings like this:

<pre><code>/usr/local/lib/python2.7/site-packages/sklearn/metrics/classification.py:1172: DeprecationWarning: The default `weighted` averaging is deprecated, and from version 0.18, use of precision, recall or F-score with multiclass or multilabel data or pos_label=None will result in an exception. Please set an explicit value for `average`, one of (None, 'micro', 'macro', 'weighted', 'samples'). In cross validation use, for instance, scoring="f1_weighted" instead of scoring="f1"
</code></pre>

How can I deal correctly with my unbalanced data so that the classifier's metrics are computed the right way?
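The warning asks for an explicit `average` value, so it may help to see what three of the listed options compute. Below is a minimal plain-Python sketch (Python 3, no scikit-learn needed) of `'micro'`, `'macro'` and `'weighted'` averaging. The helper names (`per_class_counts`, `prf`, `averaged_scores`, `accuracy`) and the toy labels are hypothetical and only for illustration; they are not part of scikit-learn and are not the question's data.

```python
def safe_div(num, den):
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def per_class_counts(y_true, y_pred):
    """Return {label: (tp, fp, fn)} for every label in y_true or y_pred."""
    pairs = list(zip(y_true, y_pred))
    counts = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts


def prf(tp, fp, fn):
    """Precision, recall and F1 from raw counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return precision, recall, f1


def averaged_scores(y_true, y_pred, average):
    """Return (precision, recall, f1) under the given averaging scheme."""
    counts = per_class_counts(y_true, y_pred)
    if average == "micro":
        # Pool the counts of all classes, then compute the metrics once.
        totals = [sum(c[i] for c in counts.values()) for i in range(3)]
        return prf(*totals)
    if average == "macro":
        # Every class counts equally, however rare it is.
        weights = {label: 1 for label in counts}
    elif average == "weighted":
        # Each class is weighted by its support (number of true instances).
        weights = {label: tp + fn for label, (tp, fp, fn) in counts.items()}
    else:
        raise ValueError("average must be 'micro', 'macro' or 'weighted'")
    per_class = {label: prf(*c) for label, c in counts.items()}
    total = sum(weights.values())
    return tuple(
        sum(per_class[label][i] * weights[label] for label in counts) / total
        for i in range(3)
    )


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, deliberately unbalanced toy labels (not the question's data).
y_true = [5, 5, 5, 5, 5, 5, 4, 4, 4, 1]
y_pred = [5, 5, 5, 5, 5, 4, 4, 4, 5, 5]

print("accuracy:", accuracy(y_true, y_pred))
for scheme in ("micro", "macro", "weighted"):
    print(scheme, averaged_scores(y_true, y_pred, scheme))
```

In scikit-learn itself the same choice is made by passing `average='micro'`, `average='macro'` or `average='weighted'` to `precision_score`, `recall_score` and `f1_score`, as the first two snippets in the question already do. For single-label multiclass data the micro-averaged scores equal the accuracy, while the macro average gives a rarely predicted class the same weight as class `5`, so it tends to be the more revealing number on unbalanced data.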
