ValueError: y contains new labels: ['#']

Posted 2024-05-16 06:30:22


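I have a list of lists, each containing 1 to 5 tags. I have also built a list of the top 50 tags. My goal is to build a new list of lists in which each inner list contains only tags from the top 50. My approach is as follows: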

First, I build a new list of lists that keeps only the top 50 tags:

    top_50 = list(np.array(pd.read_csv(os.path.join(dir,"Tags.csv")))[:,1])
    train = pd.read_csv(os.path.join(dir,"Train.csv"),iterator = True)
    top_50 = top_50[:51]
    tags = list(np.array(train.get_chunk(50000))[:,3])

    top_50_tags = [[tag for tag in list if tag in top_50] for list in tags]
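For anyone reproducing this step, here is a minimal sketch of the filtering. It assumes each row stores its tags as one whitespace-separated string. The post does not show the file contents, so that is only a guess, and `top_tags`, `raw_tags` and `keep_top` are hypothetical names with made-up data. If the rows really are strings, looping over a row directly yields single characters rather than tags.

```python
# Made-up sample data; the real Tags.csv / Train.csv contents are not shown in the post.
top_tags = ["c#", "java", "python"]
raw_tags = ["c# asp.net", "python numpy", "haskell"]

top_set = set(top_tags)  # a set makes each membership test cheap


def keep_top(row, allowed):
    """Split one whitespace-separated tag string and keep only the allowed tags."""
    return [tag for tag in row.split() if tag in allowed]


filtered = [keep_top(row, top_set) for row in raw_tags]
print(filtered)

# Iterating over the string itself walks its characters, not its tags.
chars = [ch for ch in raw_tags[0]]
print(chars)
```

Rows with no top tag come out as empty lists, which is harmless at this stage. As a side note, a slice like `top_50[:51]` keeps up to 51 items, while `[:50]` keeps 50.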

Then I try to encode the tags:

    coder = preprocessing.LabelEncoder()  
    coder = coder.fit(top_50)
    tags = [coder.transform(tag) for tag in list for list in top_50_tags]
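One thing that may matter in the encoding line: the `for` clauses of a comprehension run left to right, like nested loops. So `for tag in list for list in top_50_tags` first iterates over whatever `list` is already bound to, before the second clause ever runs. On Python 2.7 (the traceback in this post shows `C:\Python27`), the loop variable of an earlier list comprehension stays bound afterwards, so `list` could still be the last raw element of `tags`. A minimal sketch of the clause order with made-up data, using Python 3 semantics (`groups` and `group_x` are hypothetical names):

```python
# Made-up data standing in for top_50_tags.
groups = [["a", "b"], [], ["c"]]

# Reversed clause order: the first iterable is looked up before `group_x`
# has been assigned, so Python 3 raises NameError here.
try:
    [tag for tag in group_x for group_x in groups]
    reversed_error = None
except NameError as exc:
    reversed_error = exc

# Outer loop first, inner loop second: one flat list of tags.
flat = [tag for group in groups for tag in group]

# A nested comprehension keeps one result list per row, including empty rows.
per_row = [[tag.upper() for tag in group] for group in groups]

print(flat)     # ['a', 'b', 'c']
print(per_row)  # [['A', 'B'], [], ['C']]
```

If the goal is one encoded array per row, the nested form is probably the closer match, with `coder.transform(group)` called once per row instead of `tag.upper()` per tag.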

But this gives me an error:

    Traceback (most recent call last):
      File "C:\Users\Ano\workspace\final_submission\src\rf_test.py", line 69, in <module>
        main()
      File "C:\Users\Ano\workspace\final_submission\src\rf_test.py", line 33, in main
        labels = [coder.transform(tag) for tag in list for list in top_50_tags]
      File "C:\Python27\lib\site-packages\sklearn\preprocessing\label.py", line 120, in transform
        raise ValueError("y contains new labels: %s" % str(diff))
    ValueError: y contains new labels: ['#']

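I think this error is raised because some of my lists are empty, since they contain none of the top 50 tags. But the error specifically says that ['#'] is the new label. Is my assumption correct? How should I deal with this error message?
Edit: For those wondering why I use list as a variable name in the list comprehension, my actual program uses a different name for that variable.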
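A rough way to test the empty-list hypothesis without the real data: the `raise` line in the traceback formats the message from `diff`, which suggests the error lists exactly the values that were passed to `transform` but never seen in `fit`. The sketch below uses a small pure-Python stand-in, not scikit-learn itself. `TinyLabelEncoder` is a hypothetical class that only imitates that one check, and the tag values are made up.

```python
class TinyLabelEncoder:
    """Hypothetical stand-in for illustration; not scikit-learn's implementation."""

    def fit(self, labels):
        # Remember the sorted distinct labels and their integer codes.
        self.classes_ = sorted(set(labels))
        self._index = {label: i for i, label in enumerate(self.classes_)}
        return self

    def transform(self, labels):
        # Any value that was not seen during fit triggers the error.
        unseen = sorted(set(labels) - set(self._index))
        if unseen:
            raise ValueError("y contains new labels: %s" % unseen)
        return [self._index[label] for label in labels]


coder = TinyLabelEncoder().fit(["c#", "java", "python"])

# An empty row has nothing unseen in it, so it encodes to an empty list.
encoded = [coder.transform(row) for row in [["python", "c#"], []]]
print(encoded)  # [[2, 0], []]

# A tag string split into characters does contain unseen values.
try:
    coder.transform(list("c#"))
    message = None
except ValueError as exc:
    message = str(exc)
print(message)
```

Under these assumptions an empty list is not what produces `['#']`. A more likely cause is that a literal `'#'` reached `transform`, for example as one character of a tag such as `c#`.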

Update

I checked the difference between my top 50 and the tags:

    print(len(top_50.difference(tags)))

This gave me a length of 0. Does that mean my empty lists are the problem?
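A note on the direction of that check: `top_50.difference(tags)` needs `top_50` to be a set, and it reports top tags that never occur in the data. The encoder complains about the opposite case, values in the data that are not among the fitted labels. A small sketch with made-up values (`fitted` and `rows` are hypothetical names):

```python
# Made-up labels and rows; the last row imitates a tag split into characters.
fitted = {"c#", "java", "python"}
rows = [["python", "c#"], [], ["c", "#"]]

# Every distinct value that would be handed to the encoder.
seen = {tag for row in rows for tag in row}

# Direction used in the post: fitted labels that never occur in the data.
missing_from_data = fitted - seen

# Direction that matters for the error: data values the encoder never saw.
unseen_by_encoder = seen - fitted

print(sorted(missing_from_data))
print(sorted(unseen_by_encoder))
```

A result of 0 from the first direction only says that every top tag appears somewhere. It does not rule out extra values such as `'#'` in the rows being encoded.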
