Unable to export a pandas DataFrame to Excel / encoding problem

Asked 2025-04-18 00:07

I am running into an encoding problem when exporting one of my DataFrames, and the export fails.

sjM.dtypes

Customer Name              object
Total Sales               float64
Sales Rank                float64
Visit_Frequency           float64
Last_Sale          datetime64[ns]
dtype: object

Exporting to CSV works fine:

path = 'c:\\test'
sjM.to_csv(path + '.csv')   # Works

But exporting to Excel fails:

sjM.to_excel(path + '.xls')

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "testing.py", line 338, in <module>
    sjM.to_excel(path + '.xls')
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x81 in position 22: ordinal not in range(128)

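I know the problem is in the "Customer Name" column, because exporting to Excel works once I drop that column.

Following a suggestion from another question (Python pandas to_excel 'utf8' codec can't decode byte), I tried a function that decodes and re-encodes the offending column: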

def changeencode(data):
    cols = data.columns
    for col in cols:
        if data[col].dtype == 'O':
            data[col] = data[col].str.decode('latin-1').str.encode('utf-8')
    return data

sjM = changeencode(sjM)

sjM['Customer Name'].str.decode('utf-8')

L2-00864                         SETIA 2
K1-00279                     BERKAT JAYA
L2-00664                        TK. ANTO
BR00035                   BRASIL JAYA,TK
RA00011               CV. RAHAYU SENTOSA

So the conversion to unicode appears to have succeeded.

sjM.to_excel(path + '.xls')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Anaconda\Lib\site-packages\pandas\core\frame.py", line 1197, in to_excel
    excel_writer.save()
  File "c:\Anaconda\Lib\site-packages\pandas\io\excel.py", line 595, in save
    return self.book.save(self.path)
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 662, in save
    doc.save(filename, self.get_biff_data())
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 637, in get_biff_data
    shared_str_table   = self.__sst_rec()
  File "c:\Anaconda\Lib\site-packages\xlwt\Workbook.py", line 599, in __sst_rec
    return self.__sst.get_biff_record()
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 76, in get_biff_record
    self._add_to_sst(s)
  File "c:\Anaconda\Lib\site-packages\xlwt\BIFFRecords.py", line 91, in _add_to_sst
    u_str = upack2(s, self.encoding)
  File "c:\Anaconda\Lib\site-packages\xlwt\UnicodeUtils.py", line 50, in upack2
    us = unicode(s, encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 22: ordinal not in range(128)
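
  1. Why does the export still fail even though the conversion to unicode appears to have succeeded?
  2. How can I work around this so that I can export this DataFrame to Excel?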

@Jeff

Thanks for pointing me in the right direction.

Steps used:

Install xlsxwriter (it is not bundled with pandas), then:

sjM.to_excel(path + '.xlsx', sheet_name='Sheet1', engine='xlsxwriter')
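
If the column still holds byte strings, one option is to decode it to text once and not re-encode it before exporting. Below is a minimal sketch, assuming Python 3 and Latin-1 source data (the encoding guessed in the function above). `to_text` is a hypothetical helper, not part of pandas or xlsxwriter, and the sample list only stands in for the real column.

```python
def to_text(value, encoding="latin-1"):
    """Return value as Unicode text, decoding it only if it is a byte string."""
    if isinstance(value, (bytes, bytearray)):
        return bytes(value).decode(encoding)
    return value  # already text, a number, None, etc.


# Stand-in for the "Customer Name" column: a mix of bytes, text and a missing value.
names = [b"SETIA 2", b"TK. ANTO", "BRASIL JAYA,TK", None]
cleaned = [to_text(n) for n in names]
print(cleaned)  # ['SETIA 2', 'TK. ANTO', 'BRASIL JAYA,TK', None]
```

On the DataFrame itself this would be something like sjM['Customer Name'] = sjM['Customer Name'].map(to_text), followed by the to_excel call above.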

1 Answer


You need pandas >= 0.13 and the xlsxwriter engine for the Excel file, because it supports writing Unicode natively. The default engine, xlwt, will support passing an encoding option in 0.14.

For more on the available engines, see the pandas documentation on Excel writer engines.
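
As for why the export kept failing after the re-encoding: .str.encode('utf-8') turns the text back into byte strings, and the traceback shows xlwt decoding those bytes as ASCII. Here is a minimal sketch of that round trip, written for Python 3 with a made-up sample value (the real customer name is not shown in the question):

```python
# Hypothetical sample: a byte string containing the non-ASCII byte 0x81.
raw = b"TOKO \x81 JAYA"

# What the traceback shows xlwt doing on save: decoding the bytes as ASCII.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    first_error = exc

# The changeencode() round trip: Latin-1 bytes -> text -> UTF-8 bytes.
text = raw.decode("latin-1")      # real Unicode text
reencoded = text.encode("utf-8")  # bytes again; 0x81 is now 0xC2 0x81

try:
    reencoded.decode("ascii")
except UnicodeDecodeError as exc:
    second_error = exc

print(hex(raw[first_error.start]))         # 0x81
print(hex(reencoded[second_error.start]))  # 0xc2
```

This matches the two tracebacks: the offending byte changes from 0x81 to 0xc2 (the first byte of the UTF-8 encoding of U+0081), but it is still not ASCII, so the ASCII decode fails either way.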
