Getting junk data when extracting data from an Excel sheet with Python (xlrd package)


Asked 2025-04-17 19:51

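I am using Python (the xlrd library) to extract data from an Excel sheet. The extracted output contains some junk that I need help getting rid of.

The junk values look like [text:u, as in this output: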

[text:u'NAME', text:u'JACK']

<CODE>

from xlrd import open_workbook

book = open_workbook('C:/Users/arun/Desktop/EX.xls')
sheet0 = book.sheet_by_index(0)
#sheet1 = book.sheet_by_index(1)

# Print columns 0 through 27, skipping columns 1 and 11.
for i in range(28):
    if i not in (1, 11):
        print(sheet0.col(i))

</CODE>


3 Answers

If you are using sheet.cell(), I suggest using sheet.cell_value() instead, which avoids the extra text:u'...' wrapper around each value.
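As a rough illustration of this suggestion, the sketch below reads one column cell by cell with cell_value(). The helper name read_column is made up for this example, and it assumes that sheet is an xlrd Sheet such as the sheet0 from the question.

```python
def read_column(sheet, colx):
    """Return the plain values of one column, top to bottom.

    `sheet` is assumed to be an xlrd Sheet, e.g. book.sheet_by_index(0).
    `read_column` is a hypothetical helper, not part of xlrd.
    """
    # sheet.cell(rowx, colx) would return a Cell object, which prints
    # as something like text:u'NAME'. sheet.cell_value(rowx, colx)
    # returns only the value stored in that cell.
    return [sheet.cell_value(rowx, colx) for rowx in range(sheet.nrows)]
```

With the workbook from the question, read_column(sheet0, 0) should give a plain list of the values in column 0 rather than a list of Cell objects.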

Answered 2025-04-17 by Python大师


I found that using str(sheet0.col(0)) removes the :u from the text in the output.

Answered 2025-04-17 by Python大师


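What you call "garbage" looks like a list of xlrd Cell objects of type "text". In other words, I don't think it is garbage; it is what xlrd normally returns. If you want the values of a column (say, column 0), try the following:

print(sheet0.col_values(0))

See this link for more information: https://secure.simplistix.co.uk/svn/xlrd/trunk/xlrd/doc/xlrd.html#sheet.Sheet.col_values-method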
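To apply the same idea to every column at once, something like the sketch below might work. The helper names all_column_values and values_from_cells are invented for this example. It assumes sheet is an xlrd Sheet and that each Cell exposes its content through the value attribute.

```python
def all_column_values(sheet):
    """Return one list of plain values per column of an xlrd Sheet.

    Hypothetical helper; `sheet` is assumed to be e.g. book.sheet_by_index(0).
    """
    # col_values(colx) returns the values directly, whereas col(colx)
    # returns Cell objects that print as text:u'...'.
    return [sheet.col_values(colx) for colx in range(sheet.ncols)]


def values_from_cells(cells):
    """Unwrap Cell objects (as returned by sheet.col) into plain values."""
    return [cell.value for cell in cells]
```

If you already hold the list returned by sheet0.col(0), values_from_cells(sheet0.col(0)) should yield the same values as sheet0.col_values(0).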

Answered 2025-04-17 by Python大师

