Populating a SQLite3 database from a .txt file with Python

13 votes

6 answers

7670 views

数据工程师

Asked 2025-04-16 01:26

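I am trying to set up a website in Django that lets users look up information about their representatives in the European Parliament. I have the data in a comma-separated .txt file in the following format: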

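Parliament, Name, Country, Party_Group, National_Party, Position

7, Marta Andreasen, United Kingdom, Europe of Freedom and Democracy Group, United Kingdom Independence Party, Member

and so on...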

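I want to load this data into a SQLite3 database, but every tutorial I have found so far only shows how to enter data by hand. Since the file has 736 records, I really don't want to type them in one by one.

I suspect this is a simple matter, but I would be very grateful if someone could show me how to do it.

Thomas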

Tags: database, django, sqlite3, data query, text file processing, data format, record management, data import

6 Answers

You can read the data with the csv module. Then build an INSERT statement and run it for all rows at once with the executemany method:

  cursor.executemany(sql, rows)

If you are using SQLAlchemy, you can also use its add_all method.
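
To make the csv plus executemany idea concrete, here is a minimal sketch in Python 3. It is not taken from the answer itself: the in-memory database, the lowercase column names, and the `io.StringIO` stand-in for the .txt file are all assumptions chosen so the example runs on its own.

```python
import csv
import io
import sqlite3

# Stand-in for the comma-separated .txt file (same layout as in the question).
sample = io.StringIO(
    "Parliament, Name, Country, Party_Group, National_Party, Position\n"
    "7, Marta Andreasen, United Kingdom, Europe of Freedom and Democracy Group, "
    "United Kingdom Independence Party, Member\n"
)

reader = csv.reader(sample, skipinitialspace=True)  # drop the space after each comma
next(reader)                                         # skip the header row
rows = [tuple(row) for row in reader]

conn = sqlite3.connect(":memory:")                   # throwaway in-memory database
conn.execute(
    "CREATE TABLE politicians (parliament TEXT, name TEXT, country TEXT, "
    "party_group TEXT, national_party TEXT, position TEXT)"
)
sql = "INSERT INTO politicians VALUES (?, ?, ?, ?, ?, ?)"
conn.executemany(sql, rows)                          # one call inserts every row
conn.commit()

inserted = conn.execute("SELECT name, country FROM politicians").fetchall()
print(inserted)
```

For the real file, replacing `sample` with an opened file object and `":memory:"` with a database path should be the only changes needed.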

Answered 2025-04-16 by Python大师


As SiggyF says, and only slightly differently from Joschua:

First, create a text file containing your database schema, for example:

CREATE TABLE politicians (
    Parliament text, 
    Name text, 
    Country text, 
    Party_Group text, 
    National_Party text, 
    Position text
);

Create the table:

>>> import csv, sqlite3
>>> conn = sqlite3.connect('my.db')
>>> c = conn.cursor()
>>> with open('myschema.sql') as f:            # read in schema file 
...   schema = f.read()
... 
>>> c.execute(schema)                          # create table per schema 
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()                              # commit table creation
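
One caveat worth knowing: `execute()` accepts only a single SQL statement, so it stops working once the schema file holds more than one. A minimal sketch of the alternative, `executescript()`, follows. The second statement (an index named `idx_politicians_country`) is a hypothetical addition for illustration, and the schema is inlined as a string instead of being read from myschema.sql.

```python
import sqlite3

# Hypothetical schema text with two statements; in the session above it would
# come from myschema.sql instead.
schema = """
CREATE TABLE politicians (
    Parliament text,
    Name text,
    Country text,
    Party_Group text,
    National_Party text,
    Position text
);
CREATE INDEX idx_politicians_country ON politicians (Country);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(schema)          # runs every statement in the string

# List what was created to confirm both statements ran.
created = sorted(
    name for (name,) in conn.execute("SELECT name FROM sqlite_master")
)
print(created)
```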

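Next, read the file containing the data to insert with the csv module (this session is Python 2, as the `.next()` call and the `u'...'` output show):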

>>> csv_reader = csv.reader(open('myfile.txt'), skipinitialspace=True)
>>> csv_reader.next()                          # skip the first line in the file
['Parliament', 'Name', 'Country', ...

# put all data in a list of tuples
# edit: decoding from utf-8 file to unicode
>>> to_db = [tuple(i.decode('utf-8') for i in line) for line in csv_reader]
>>> to_db                                      # this will be inserted into table
[(u'7', u'Marta Andreasen', u'United Kingdom', ...

Then insert the data:

>>> c.executemany("INSERT INTO politicians VALUES (?,?,?,?,?,?);", to_db)
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()

Finally, verify that everything went as expected:

>>> c.execute('SELECT * FROM politicians').fetchall()
[(u'7', u'Marta Andreasen', u'United Kingdom', ...

Addendum:
Since you decoded on input (to unicode), make sure you encode on output.
For example:

with open('encoded_output.txt', 'w') as f:
  for row in c.execute('SELECT * FROM politicians').fetchall():
    for col in row:
      f.write(col.encode('utf-8'))
      f.write('\n')
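
The session above is written for Python 2. Under Python 3, the explicit `.decode()` and `.encode()` calls are not needed, because `open()` can do the conversion when given an `encoding` argument. The following is a rough sketch of the same read, insert, and write-out flow under that assumption. The three-column table, the temporary directory, and the accented sample name are made up to keep it short and self-contained.

```python
import csv
import os
import sqlite3
import tempfile

# Write a small UTF-8 sample so the sketch is self-contained; the accented
# name is made up to exercise non-ASCII text.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "myfile.txt")
with open(src, "w", encoding="utf-8", newline="") as f:
    f.write("Parliament, Name, Country\n7, José Example, Spain\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE politicians (Parliament text, Name text, Country text)")

# encoding= makes open() decode the bytes, so csv yields str values directly.
with open(src, encoding="utf-8", newline="") as f:
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)                                   # skip the header row
    to_db = [tuple(line) for line in reader]

conn.executemany("INSERT INTO politicians VALUES (?,?,?)", to_db)
conn.commit()

# Encoding on output is likewise handled by open().
out = os.path.join(workdir, "encoded_output.txt")
with open(out, "w", encoding="utf-8") as f:
    for row in conn.execute("SELECT * FROM politicians"):
        for col in row:
            f.write(col + "\n")

with open(out, encoding="utf-8") as f:
    written = f.read()
print(written)
```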

Answered 2025-04-16 by Python大师


Assuming your models.py looks something like this:

from django.db import models

class Representative(models.Model):
    parliament = models.CharField(max_length=128)
    name = models.CharField(max_length=128)
    country = models.CharField(max_length=128)
    party_group = models.CharField(max_length=128)
    national_party = models.CharField(max_length=128)
    position = models.CharField(max_length=128)

Then you can run python manage.py shell and execute the following:

import csv
from your_app.models import Representative
# If you're using different field names, change this list accordingly.
# The order must also match the column order in the CSV file.
fields = ['parliament', 'name', 'country', 'party_group', 'national_party', 'position']
with open('your_file.csv') as f:
    # skipinitialspace drops the space that follows each comma in the file
    reader = csv.reader(f, skipinitialspace=True)
    next(reader)  # skip the header row, if the file has one
    for row in reader:
        Representative.objects.create(**dict(zip(fields, row)))

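And you're done.

Edit

At Thomas's request, here is an explanation of what **dict(zip(fields, row)) does:

First, fields is the list of field names we defined, and row is the list of values from the current line of the CSV file.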

fields = ['parliament', 'name', 'country', ...]
row = ['7', 'Marta Andreasen', 'United Kingdom', ...]

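zip() pairs up the items of two lists, like a zipper. For example, list(zip(['a','b','c'], ['A','B','C'])) gives [('a','A'), ('b','B'), ('c','C')]. (In Python 3, zip() itself returns a lazy iterator, hence the list() call; in Python 2 it returns the list directly.) In our case:

>>> list(zip(fields, row))
[('parliament', '7'), ('name', 'Marta Andreasen'), ('country', 'United Kingdom'), ...]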

The dict() function turns that sequence of pairs into a dictionary.

>>> dict(zip(fields, row))
{'parliament': '7', 'name': 'Marta Andreasen', 'country': 'United Kingdom', ...}

** is a way of unpacking a dictionary into keyword arguments for a function call. So function(**{'key': 'value'}) is equivalent to function(key='value'). In our example, calling create(**dict(zip(fields, row))) is equivalent to:

create(parliament='7', name='Marta Andreasen', country='United Kingdom', ...)
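
The three steps can also be tried outside Django. In this small sketch, `create` is a hypothetical stand-in that simply returns its keyword arguments (it is not the Django manager method), and the lists are shortened to three fields.

```python
fields = ['parliament', 'name', 'country']
row = ['7', 'Marta Andreasen', 'United Kingdom']


def create(**kwargs):
    """Hypothetical stand-in for Representative.objects.create; returns its keyword arguments."""
    return kwargs


pairs = list(zip(fields, row))      # list() because zip() is lazy in Python 3
as_dict = dict(pairs)

# Unpacking the dict and spelling the keywords out are the same call.
unpacked = create(**as_dict)
explicit = create(parliament='7', name='Marta Andreasen', country='United Kingdom')
print(pairs)
print(unpacked == explicit)
```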

I hope that makes it clearer.

Answered 2025-04-16 by Python大师
