How to import data from a CSV file into my SQLite3 table using Python

    connection = _sqlite3.connect("db_name.sqlite")
    cursor = connection.cursor()
    with open('path_to_csv', 'r') as file:
        no_records = 0
        for row in file:
            cursor.execute("INSERT INTO table_name (index_of, high, low, original, ship_date) VALUES (?,?,?,?,?)", row.split(","))
            connection.commit()
            no_records += 1
    connection.close()

2 answers

User
#1 · edited 2024-05-14 11:14:38

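You need to provide a default value for the fifth column.
You can also improve the script's performance by inserting the rows in blocks of 100-200 per SQL statement.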
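
The batching suggestion could be sketched roughly as follows. This is only an illustration under assumptions: the CSV is taken to have four columns, the missing fifth value is filled with a running index, and the helper name `insert_in_batches`, the batch size, and the sample rows are all hypothetical. It sends each batch with `executemany` instead of building a literal multi-row `INSERT` statement, which is a substitute technique with a similar effect.

```python
import csv
import io
import sqlite3


def insert_in_batches(connection, rows, batch_size=200):
    """Insert rows in batches and return the number of rows inserted.

    Hypothetical helper: prepends a running index as the index_of value,
    because the CSV rows are assumed to carry only four values.
    """
    sql = (
        "INSERT INTO table_name (index_of, high, low, original, ship_date) "
        "VALUES (?, ?, ?, ?, ?)"
    )
    total = 0
    batch = []
    for index, row in enumerate(rows):
        batch.append([index, *row])
        if len(batch) >= batch_size:
            connection.executemany(sql, batch)  # one call per full batch
            total += len(batch)
            batch.clear()
    if batch:  # flush the final, partially filled batch
        connection.executemany(sql, batch)
        total += len(batch)
    connection.commit()  # commit once instead of once per row
    return total


# Hypothetical sample data standing in for the CSV file.
sample_csv = io.StringIO("1.5,1.0,1.2,2024-01-01\n2.5,2.0,2.2,2024-01-02\n")
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute(
    "CREATE TABLE table_name "
    "(index_of INTEGER, high REAL, low REAL, original REAL, ship_date TEXT)"
)
inserted = insert_in_batches(demo_connection, csv.reader(sample_csv), batch_size=1)
```

Committing once at the end, rather than after every row as in the question, is usually where most of the time is saved.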

User
#2 · edited 2024-05-14 11:14:38

As user3380595 already pointed out in their answer, you need to supply a value for the column index_of:

    cursor.execute("""
        INSERT INTO Quotes (index_of, high, low, original, ship_date)
        VALUES (?, ?, ?, ?, ?)
        """, [index, *row])
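
As a side note, when index_of is declared `INTEGER PRIMARY KEY`, SQLite treats it as an alias for the rowid and assigns a value itself if the column is left out of the `INSERT`. The sketch below illustrates that alternative with an in-memory database and made-up sample rows; it is not the approach used in this answer.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    """
    CREATE TABLE Quotes (
        index_of INTEGER PRIMARY KEY,
        high REAL,
        low REAL,
        original REAL,
        ship_date TEXT
    )
    """
)

# Made-up rows with only the four values a CSV line would provide.
rows = [
    ("1.5", "1.0", "1.2", "2024-01-01"),
    ("2.5", "2.0", "2.2", "2024-01-02"),
]

# index_of is omitted, so SQLite fills it in automatically.
connection.executemany(
    "INSERT INTO Quotes (high, low, original, ship_date) VALUES (?, ?, ?, ?)",
    rows,
)
connection.commit()

assigned = [r[0] for r in connection.execute("SELECT index_of FROM Quotes ORDER BY index_of")]
```

In an empty table the automatically assigned values start at 1, whereas the `enumerate`-based index in this answer starts at 0.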
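I created 200,000 rows of test data, and loading them is very fast (under 2 seconds). See the first example, which uses csv and sqlite3.
As user3380595 mentioned, if you are concerned about memory and performance, you can load the data in chunks. In this scenario that actually loads slightly slower. See the second example, which uses pandas and sqlalchemy.

Using csv and sqlite3

Setting up the test environment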

    import csv
    import sqlite3
    import contextlib

    test_data = r"/home/thomas/Projects/Playground/stackoverflow/data/test.csv"
    test_db = r"/home/thomas/Projects/Playground/stackoverflow/data/test.db"

    # contextlib.closing makes sure the connection is closed on exit
    with contextlib.closing(sqlite3.connect(test_db)) as connection:
        cursor = connection.cursor()
        cursor.execute("DROP TABLE IF EXISTS Quotes;")
        cursor.execute("""
            CREATE TABLE IF NOT EXISTS Quotes (
                index_of INTEGER PRIMARY KEY,
                high REAL,
                low REAL,
                original REAL,
                ship_date TEXT
            );
            """)
        connection.commit()

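Loading the data

    with contextlib.closing(sqlite3.connect(test_db)) as connection:
        cursor = connection.cursor()
        # newline="" is the recommended way to open files for the csv module
        with open(test_data, "r", newline="") as file:
            # enumerate supplies the value for index_of
            for index, row in enumerate(csv.reader(file)):
                cursor.execute("""
                    INSERT INTO Quotes (index_of, high, low, original, ship_date)
                    VALUES (?, ?, ?, ?, ?)
                    """, [index, *row])
        # commit once after all rows have been inserted
        connection.commit()
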
Using pandas and sqlalchemy

Setting up the test environment

    import pandas as pd
    from sqlalchemy import create_engine, text

    test_data = r"/home/thomas/Projects/Playground/stackoverflow/data/test.csv"
    test_db = r"sqlite:////home/thomas/Projects/Playground/stackoverflow/data/test.db"

    engine = create_engine(test_db, echo=True)

    # run the statements on the connection opened by engine.begin();
    # text() is required for raw SQL strings in recent SQLAlchemy versions
    with engine.begin() as connection:
        connection.execute(text("DROP TABLE IF EXISTS Quotes;"))
        connection.execute(text("""
            CREATE TABLE IF NOT EXISTS Quotes (
                index_of INTEGER PRIMARY KEY,
                high REAL,
                low REAL,
                original REAL,
                ship_date TEXT
            );
            """))

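Loading the data (in chunks)

    with engine.begin() as connection:
        # read_csv treats the first line as a header by default, so the
        # column names in the file must match the table columns
        reader = pd.read_csv(test_data, iterator=True, chunksize=50000)
        for chunk in reader:
            # the row index keeps counting across chunks
            chunk["index_of"] = chunk.index
            chunk.to_sql("Quotes", con=connection, if_exists="append", index=False)
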
Alternatively, you can process the rows in chunks without using pandas.
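
One possible way to do this with only the standard library is `itertools.islice`. This is an assumption about what chunked processing without pandas could look like, not necessarily the tool meant above; the helper name `iter_chunks` and the sample data are hypothetical.

```python
import csv
import io
import itertools


def iter_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from any row iterator."""
    while True:
        chunk = list(itertools.islice(reader, chunk_size))
        if not chunk:  # the iterator is exhausted
            return
        yield chunk


# Hypothetical sample data standing in for the CSV file.
sample_csv = io.StringIO("a,1\nb,2\nc,3\n")
chunks = list(iter_chunks(csv.reader(sample_csv), chunk_size=2))
```

Each yielded chunk is a plain list of rows, so it could be passed to `cursor.executemany` to insert one block at a time.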
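Alternatively, you can use dask (the example assumes an existing database containing the table Quotes) and write the data in parallel. However, I don't think your data requires it.

    import dask.dataframe as dd

    test_data = r"/home/thomas/Projects/Playground/stackoverflow/data/test.csv"
    test_db = r"sqlite:////home/thomas/Projects/Playground/stackoverflow/data/test.db"

    df = dd.read_csv(test_data)  # , blocksize=2e6
    # caution: dask may restart the index in every partition, so index_of
    # is only guaranteed to be unique when the file is read as one partition
    df["index_of"] = df.index
    df.to_sql("Quotes", uri=test_db, if_exists="append", index=False, parallel=True)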
