CSV到sqlite3数据库。转换从Utf8到Unicode的列表

2024-03-28 21:11:16 发布

您现在位置:Python中文网/ 问答频道 /正文

import csv, sqlite3

conn = sqlite3.connect("mycustomers9.sql")
curs =  conn.cursor()
try:
    curs.execute("CREATE TABLE t (unknown1 TEXT, county TEXT, businessName TEXT, address1 TEXT, city1 TEXT, zip1 INTEGER, phone1 INTEGER,Email1 TEXT, approvalstatus TEXT, date1 TEXT, date2 TEXT, typeofConstruct TEXT, typeofBiz TEXT, unknown2 TEXT, unknown3 TEXT, unknown4 TEXT, unknown5 TEXT, unknown6 TEXT,BizName2 TEXT,Address2 TEXT, City2 TEXT,Zip2 TEXT,Country2 TEXT,Phone2 TEXT,Email2 TEXT,Phone3 TEXT);")
except sqlite3.OperationalError:
    print "Table already exist"
with open('HR_plan_review.csv', 'rb') as infile:
    dr = csv.DictReader(infile, delimiter = ',')
    to_db = [(i["unknown1"], i['county'], i['businessName'], i['address1'], i['city1'], i['zip1'], i['phone1'], i['Email1'], i['approvalstatus'], i['date1'],i['date2'], i['typeofConstruct'], i['typeofBiz'], i['unknown2'], i['unknown3'], i['unknown4'], i['unknown5'], i['unknown6'], i['BizName2'], i['Address2'], i['City2'], i['Zip2'], i['Country2'], i['Phone2'], i['Email2'], i['Phone3']) for i in dr]

curs.executemany("INSERT INTO t (unknown1, county, businessName, address1, city1,zip1, phone1, Email1, approvalstatus, date1, date2,typeofConstruct, typeofBiz, unknown2, unknown3, unknown4,unknown5, unknown6,BizName2,Address2, City2,Zip2,Country2,Phone2,Email2,Phone3) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?);", to_db)

to_db返回一个以utf-8编码的列表,sqllite数据库似乎正在请求以unicode格式进行格式化。在运行上面的sql语句之前,如何将“to_db”列表转换为unicode。下面是我运行上述代码时得到的错误消息。在

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a te xt_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode str ings.

根据答案输入进行编辑

修改后的代码(如下)现在成功执行,但它没有将从csv获取的值插入数据库。在

^{pr2}$

Tags: csvtotextdbsqlite3countycursaddress1
3条回答

遗憾的是,csv无法处理unicode(至少在python2.7中是这样)。但您可以通过在生成器中包装DictReader来解决此问题:

with open('HR_plan_review.csv', 'rb') as infile:
    dr = csv.DictReader(infile, delimiter = ',')
    def unicoded_data():
        for row in dr:
            # Assuming infile encoding is utf-8.
            yield dict([(key, unicode(value, encoding='utf-8'))
                       for key, value in row.iteritems()])

    to_db = [(i["unknown1"], i['county'], i['businessName'], i['address1'], i['city1'], i['zip1'], i['phone1'], i['Email1'], i['approvalstatus'], i['date1'],i['date2'], i['typeofConstruct'], i['typeofBiz'], i['unknown2'], i['unknown3'], i['unknown4'], i['unknown5'], i['unknown6'], i['BizName2'], i['Address2'], i['City2'], i['Zip2'], i['Country2'], i['Phone2'], i['Email2'], i['Phone3']) for i in unicoded_data()]

您需要提交执行:

conn.commit()

另外,您的executemany语句应该放在“with”块中:

^{pr2}$
keys=("unknown1", 'county', 'businessName', 'address1',
    'city1', 'zip1', 'phone1', 'Email1', 'approvalstatus',
    'date1','date2', 'typeofConstruct', 'typeofBiz', 'unknown2',
    'unknown3', 'unknown4', 'unknown5', 'unknown6', 'BizName2',
    'Address2', 'City2', 'Zip2', 'Country2', 'Phone2',
    'Email2', 'Phone3')
args=[tuple(i[key].decode('utf-8') for key in keys) for row in dr]
sql='INSERT INTO t ({f}) VALUES ({p})'.format(
    f=','.join(keys),
    p=','.join(['?']*len(keys)))
curs.executemany(sql, args)

或者,为了获得更可靠的解决方案,您可以使用UnicodeDictReader,UnicodeReader (from the csv docs)的稍作修改,以dict形式返回具有unicode值的行:

^{pr2}$

我在上面发布的代码仍然可以使用,只要更改一下

args=[tuple(i[key].decode('utf-8') for key in keys) for row in dr]

args=[tuple(i[key] for key in keys) for row in dr]

相关问题 更多 >