在命令行或Python中将CSV导入SQLite数据库
根据Bernie的要求,我想把这个内容简化成一个更简单的例子:
我有一个CSV文件,里面有一个月份,星期几是列标题:
Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday
1,2,3,4,5,6,7
8,9,10,11,12,13,14
15,16,17,18,19,20,21
22,23,24,25,26,27,28
在命令行中,我创建了一个SQLite表,叫做days:
sqlite> CREATE TABLE days(
...> Monday int,
...> Tuesday int,
...> Wednesday int,
...> Thursday int,
...> Friday int,
...> Saturday int,
...> Sunday int
...> );
当我尝试从这个CSV文件导入数据时,出现了这样的情况:
sqlite> .import example.csv days
Error: example.csv line 1: expected 7 columns of data but found 1
我该如何将这个CSV文件导入到数据库中,让它能识别每一行新数据呢?谢谢!
3 个回答
0
看看这个工具,叫做termsql,它就是为了像这样的需求而设计的。用它来完成你的任务会简单很多。
手册在这里:http://tobimensch.github.io/termsql/
页面底部有一些示例,里面有一个关于CSV导入的例子,记得看看不同的选项是什么。
0
SQLite的命令行工具有点麻烦。现在的版本不支持行头,而且它的行为在标准(RFC 4180)和常见做法之间有些不一致。即将发布的3.8版本会支持行头。
因为你在用Python,你可能会觉得APSW Shell很有用(顺便说一下,我是这个工具的作者)。你可以像使用SQLite命令行工具一样从命令行使用它,也可以在程序中使用,包括添加你自己的命令。
值得注意的是,它有一个自动导入的命令,可以自动识别所有内容,包括行头、分隔符、数据类型等等。
sqlite> .help autoimport
.autoimport FILENAME ?TABLE? Imports filename creating a table and
automatically working out separators and data
types (alternative to .import command)
The import command requires that you precisely pre-setup the table and schema,
and set the data separators (eg commas or tabs). In many cases this information
can be automatically deduced from the file contents which is what this command
does. There must be at least two columns and two rows.
If the table is not specified then the basename of the file will be used.
Additionally the type of the contents of each column is also deduced - for
example if it is a number or date. Empty values are turned into nulls. Dates
are normalized into YYYY-MM-DD format and DateTime are normalized into ISO8601
format to allow easy sorting and searching. 4 digit years must be used to
detect dates. US (swapped day and month) versus rest of the world is also
detected providing there is at least one value that resolves the ambiguity.
Care is taken to ensure that columns looking like numbers are only treated as
numbers if they do not have unnecessary leading zeroes or plus signs. This is
to avoid treating phone numbers and similar number like strings as integers.
This command can take quite some time on large files as they are effectively
imported twice. The first time is to determine the format and the types for
each column while the second pass actually imports the data.
1
在执行.import命令之前,你需要先加上以下这行代码:
.separator ,
这行代码的作用是告诉导入命令去寻找分隔符(在这个例子中是逗号)。
想了解更多关于sqlite命令行的内容,可以点击这里查看:http://www.sqlite.org/sqlite.html