ValueError: too many values to unpack in scikitlearn.train

Question

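I'm doing sentiment analysis and want to test the accuracy of several classifiers. If I don't convert the training set to a dictionary, I get the error "AttributeError: 'tuple' object has no attribute 'iterkeys'". After I convert it to a dictionary, I get a different error: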

Traceback (most recent call last):
  File "E:\Python27\accuracy.py", line 204, in <module>
    print 'BernoulliNB`s accuracy is %f' %score(BernoulliNB())
  File "E:\Python27\accuracy.py", line 200, in score
    classifier.train(trainset)
  File "E:\Python27\lib\site-packages\nltk\classify\scikitlearn.py", line 93, in train
    for fs, label in labeled_featuresets:
ValueError: too many values to unpack
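
For reference, the failure can be reproduced without NLTK. The sketch below is not the library's code: `try_unpack` is a hypothetical helper that only mimics the `for fs, label in labeled_featuresets:` loop shown in the traceback, and the data is a shortened stand-in. Iterating over a dict yields its keys, so the loop tries to unpack the three-character string 'neg' (or 'pos') into two names, which fails. A list of two-item pairs goes through the same loop without error.

```python
# Shortened stand-in for the dict that was passed to train().
labeled_featuresets = {
    'neg': [('worst', 'film')],
    'pos': [('rare', 'combination')],
}


def try_unpack(items):
    """Mimic the loop from the traceback and report what happens."""
    try:
        for fs, label in items:
            pass
    except ValueError as exc:
        return 'ValueError: %s' % exc
    return 'ok'


# Iterating a dict yields its keys ('neg', 'pos'); each key is a
# 3-character string, which cannot be unpacked into two names.
dict_result = try_unpack(labeled_featuresets)

# A list of (featureset, label) pairs unpacks cleanly.
list_result = try_unpack([({'worst film': True}, 'neg')])

print(dict_result)
print(list_result)
```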

Part of the code:

trainset = extracted_pos_features[50:]+extracted_neg_features[50:]
testset = extracted_pos_features[:50]+extracted_neg_features[:50]
dict1 = {}
for i,j in trainset:
    dict1.setdefault(j,[]).append(i)

trainset = dict1

test, tag_test = zip(*testset)

def score(classifier):
    classifier = SklearnClassifier(classifier)
    classifier.train(trainset)
    pred = classifier.batch_classify(test)
    return accuracy_score(tag_test, pred)

print 'BernoulliNB`s accuracy is %f' %score(BernoulliNB())

The dictionary dict1 has two keys, 'neg' and 'pos', and each key maps to multiple values:

dict1

{'neg': [('tone', 'ultimately'), ('tragedy', 'core'), ('ultimately', 'dulls'), ('update', 'dreary'), ('version', 'looks'), ('voice', 'lack'), ('worst', 'film'), ('yarn', 'eloquent'), ('makes', 'little'), ('makes', 'maryam'), ('remain', 'true'), ('screen', 'time'), ('sluggish', 'time'), ('thesis', 'makes'), ('time', 'machine'), ('true', 'chan'), ('true', 'original'), ('unashamedly', 'makes'), ('time', 'true')], 

'pos': [('rock', 'destined'), ('schwarzenegger', 'van'), ('screenplay', 'curls'), ('segal', 'gorgeously'), ('slice', 'asian'), ('snappy', 'screenplay'), ('somehow', 'pulls'), ('sometimes', 'movies'), ('splash', 'arnold'), ('start', 'emerges'), ('steers', 'snappy'), ('steven', 'segal'), ('top', 'game'), ('trilogy', 'huge'), ('van', 'damme'), ('vision', 'effective'), ('wasabi', 'start'), ('words', 'adequately'), ('cat', 'offers'), ('emerges', 'rare'), ('game', 'offers'), ('offers', 'refreshingly'), ('rare', 'combination'), ('rare', 'issue'), ('offers', 'rare')]}

Does anyone know how to solve this problem? Thanks a lot.
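
One possible direction, offered as a sketch rather than a confirmed fix: NLTK's SklearnClassifier.train is documented to take a list of (featureset, label) pairs in which each featureset is a dict. That would explain both errors: a tuple featureset has no iterkeys, and a dict grouped by label no longer iterates as pairs. The example assumes each item in trainset is a (bigram, label) pair, as the contents of dict1 suggest. The helper name `to_labeled_featuresets` and the choice to join each bigram into a single string key are illustrative assumptions, not something taken from the original code.

```python
# Hypothetical sample shaped like the (bigram, label) pairs in trainset.
pairs = [
    (('worst', 'film'), 'neg'),
    (('rare', 'combination'), 'pos'),
]


def to_labeled_featuresets(pairs):
    """Turn (bigram, label) pairs into (dict, label) pairs.

    Each bigram becomes a one-entry dict such as {'worst film': True},
    and the overall structure stays a list of pairs instead of being
    regrouped by label.
    """
    return [({' '.join(bigram): True}, label) for bigram, label in pairs]


trainset = to_labeled_featuresets(pairs)

# Every item now unpacks into exactly two names, like the loop in train().
for fs, label in trainset:
    print('%s -> %s' % (sorted(fs), label))
```

If this shape is what the classifier needs, the featuresets in the test set would presumably have to be converted to dicts the same way before classifying them.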

Tags: error handling, data structures, scikit-learn, training set, classifier, sentiment analysis
