sqlalchemy UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 when querying an mssql 2005 table

Question

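I am developing an application that has to work with another application's database, which runs on mssql 2005 (I can't change this or the existing table structure). The character set of the mssql tables is "hebrew bin", and the application displays the Hebrew text from the tables perfectly. All of my .py files are encoded in utf-8.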

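Note: writing unicode Hebrew strings to the database works without any problem. Filtering and deleting rows also works: DBSession2.query(object).filter(object.LOADED=='Y').delete(). But when I select data from the table, I get a very annoying error: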

  File "D:\Python27\learn\agent\agent\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe0 in position 0: invalid continuation byte

The byte value reported in the error changes depending on the first byte of the first row in the table.

Yes, I know that byte represents a Hebrew letter. That shouldn't be a problem, since everything uses unicode, or at least that is what I thought.
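To illustrate why a byte like 0xe0 can be a valid Hebrew letter and still trip the utf-8 codec, here is a minimal sketch. It assumes the column data reaches Python as single-byte Windows-1255 (cp1255) text, which the question does not confirm. The sample bytes and the `try_decode` helper are hypothetical and only serve the illustration.

```python
# Assumption: the Hebrew text is stored as single-byte Windows-1255 (cp1255).
# Hypothetical sample: 0xe0 is the byte from the traceback, 0xe7 the one from the title.
raw = b"\xe0\xe7"


def try_decode(data, encoding):
    """Return the decoded text, or None if the bytes are not valid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# In utf-8, 0xe0 announces a three-byte sequence, and 0xe7 is not a valid
# continuation byte, so decoding fails at position 0.
as_utf8 = try_decode(raw, "utf-8")

# In cp1255 the same two bytes are the Hebrew letters alef and het.
as_cp1255 = try_decode(raw, "cp1255")
```

With these sample bytes, `as_utf8` is `None` while `as_cp1255` is `u"\u05d0\u05d7"`. That would be consistent with the reported byte following whatever the first stored character happens to be.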

By the way, this works fine against the test mssql 2005 server, but not against the production server.
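If the two servers really do hand back different kinds of values (already-decoded text on one, raw bytes on the other), a small normalising helper is one way to tolerate both. This is only a sketch under that assumption. `to_text` is a hypothetical name, the cp1255 default is a guess based on the "hebrew bin" setting, and nothing here is part of sqlalchemy or pyodbc.

```python
def to_text(value, encoding="cp1255"):
    """Return text for a value that may be raw bytes or already-decoded text.

    Assumption: raw bytes coming from the database are cp1255-encoded Hebrew.
    Anything that is not a byte string (unicode text, None, numbers) is
    returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Hypothetical sample values: raw cp1255 bytes, already-decoded text, and NULL.
samples = [b"\xf9\xec\xe5\xed", u"\u05e9\u05dc\u05d5\u05dd", None]
normalised = [to_text(v) for v in samples]
```

The first two samples normalise to the same Hebrew word, and `None` passes through untouched. The helper only helps if the raw bytes can be obtained before something else tries to decode them as utf-8.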

Some code. This is the part of the function where it fails:

def iterateJson(parser,injson,object):
    '''iterateJson(parser, injson, object): takes a parser method and a json and iterates over the json
    with the parser method, checks for existing objects in the db table and deletes them before committing the new one to
    prevent IntegrityErrors
    writes ValidateJsonError to the error log for each element in the json
    takes an object name to check for loaded etc'''

    #first lets erase the table from loaded objects
    DBSession2.query(object).filter(object.LOADED=='Y').delete()
    print "finished deleting loaded"
    #now lets get a list from the table of loaded id
    raw_list = DBSession2.query(object).all() #the failing part!
    print "getting raw list of unloaded" #doesn't get here!
    if object == Activities:
        id_list = [e.EVENTCODE for e in raw_list]
        id = e.EVENTCODE

This is part of the sqlalchemy class:

class Deposit(DeclarativeBase2):
    __tablename__ = 'NOAR_LOADDEPOSIT'
    #LINE = Column(INT(8)) 
    RECDEBNUM = Column(NVARCHAR(9) , primary_key=True)
    CURDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    PAYTYPE = Column(CHAR(1), nullable=False, default=text(u"('')")) 
    BANKCODE = Column(NVARCHAR(8), nullable=False, default=text(u"('')")) 
    CUSTACCNAME = Column(NVARCHAR(16), nullable=False, default=text(u"('')")) 
    PAGENUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')"))
    RECNUM = Column(NVARCHAR(2), nullable=False, default=text(u"('')")) 
    RECDATE = Column(BIGINT, nullable=False, default=text(u'((0))')) 
    FIXNUM = Column(NCHAR(1), nullable=False, default=text(u"('')")) 
    EVENTNUM = Column(NVARCHAR(5), nullable=False, default=text(u"('')")) 
    GROUPCODE = Column(NVARCHAR(7), nullable=False, default=text(u"('')")) 
    IDNUMBER = Column(NVARCHAR(9), nullable=False, default=text(u"('')"))

And another class (both classes show the same problem):

class Activities(DeclarativeBase2):  


    __tablename__ = 'NOAR_LOADEVENTS'

    EVENTCODE = Column(NVARCHAR(8), primary_key=True)
    EVENTDES = Column(Unicode, nullable=False, default=text(u"('')"))
    TYPE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LC = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LD = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LE = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LF = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LG = Column(NCHAR(1), nullable=False, default=text(u"('')"))
    LH = Column(NCHAR(1), nullable=False, default=text(u"('')"))

I am using: python 2.7 (64-bit Windows), pyodbc 2.1.11, mssql server 2005, sqlalchemy 0.7.3, tg2.1.3

Any help or pointers would be much appreciated!



2 Answers
