Psycopg2: copying from a CSV file into Postgres

Posted 2024-04-20 03:38:03

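I have a CSV file that I read into pandas and need to insert into Postgres. Some fields in the file contain strings with the backslash character "\". This causes a problem because the copy_from function reads it as an escape character. How can I make it ignore the "\" and keep it as part of the string? I have tried many different encodings, but I still get a "cannot decode character" error. I cannot simply replace the character, because it is significant in the string.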

def load_into_db(cur, con, file, table_name):
    f = open(file, mode="r", encoding='utf-8')
    try:
        # print("wrote to csv")
        sqlstr = "COPY {} FROM STDIN DELIMITER '|' CSV".format(table_name)
        cur.copy_from(f, table_name, null="nan", sep="|")
        con.commit()
        f.close()
    except Exception as e:
        print(e)
        print("something went wrong")

Example of a row that causes the problem:

^{tb1}$

Error: invalid byte sequence for encoding "UTF8": 0xa2
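One possible reading of this error, offered as a guess because the example row is not available: in COPY's text format, a backslash followed by up to three octal digits stands for the byte with that value, so a field containing something like `\242` would be turned into the single byte 0xa2, which is not valid UTF-8 on its own. The sketch below only reproduces that arithmetic in plain Python; the field value is hypothetical.

```python
# Hypothetical field value; the real row from the question is not available.
field = r"ID\242"

# COPY's text format reads a backslash plus octal digits as one raw byte.
octal_digits = field.split("\\")[1]
raw = bytes([int(octal_digits, 8)])

# A lone 0xa2 byte is a UTF-8 continuation byte, so decoding it fails.
try:
    raw.decode("utf-8")
    decodes = True
except UnicodeDecodeError:
    decodes = False

print(raw, decodes)
```

If this is what happens, doubling the backslash before the data reaches COPY should make the server see a literal backslash again.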


1 answer
User
#1 · Posted 2024-04-20 03:38:03
import csv
import io

def df2db(df_a, table_name, engine):
    output = io.StringIO()
    # Write tab-separated text without the index or header.
    # quoting=csv.QUOTE_NONE alone would leave backslashes as they are;
    # escapechar='\\' makes the writer escape them so COPY reads them literally.
    df_a.to_csv(output, sep='\t', index=False, header=False,
                quoting=csv.QUOTE_NONE, escapechar='\\')
    # Rewind to the start of the buffer so copy_from reads from the beginning
    output.seek(0)

    # engine comes from sqlalchemy.create_engine
    connection = engine.raw_connection()
    cursor = connection.cursor()
    # Empty fields (how to_csv writes NaN/None by default) are loaded as NULL
    cursor.copy_from(output, table_name, null='')
    connection.commit()
    cursor.close()
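To see what `escapechar='\\'` changes, here is a minimal sketch that uses Python's `csv` module directly, with the same quoting options as the `to_csv` call above. As far as I know pandas hands these options to the same writer, but treat that as an assumption. The helper name `to_copy_text` and the sample row are made up for illustration.

```python
import csv
import io


def to_copy_text(rows):
    """Serialize rows as tab-separated text with backslash escaping."""
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # never wrap fields in quotes
        escapechar="\\",         # prefix special characters with a backslash
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical row with a Windows-style path containing single backslashes.
sample = to_copy_text([[1, "C:\\temp\\new"]])
print(sample)
```

Each backslash in the field is written twice, which is the form COPY's text format reads back as one literal backslash.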

Use the df2db function to insert a DataFrame into an existing table. The table's columns and the DataFrame's columns must match.

import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('postgresql+psycopg2://user:psw@localhost:5432/dbname')
df = pd.read_csv(file)
df2db(df, table_name, engine)
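An alternative that may be worth trying, sketched under assumptions: the question builds a `COPY ... CSV` statement but never runs it, and psycopg2's `cursor.copy_expert(sql, file)` can run such a statement. In COPY's CSV format the backslash is not an escape character by default, so it should arrive unchanged. The function name `copy_csv_file` and its defaults are hypothetical, the table name is interpolated as-is and must come from a trusted source, and double quotes in the data would be treated as CSV quoting.

```python
def copy_csv_file(cur, path, table_name, sep="|", null="nan", encoding="utf-8"):
    """Load a delimited file with COPY in CSV format via copy_expert.

    cur is assumed to be a psycopg2 cursor; table_name must be trusted,
    because it is placed into the SQL text without quoting.
    """
    sql = "COPY {} FROM STDIN WITH (FORMAT csv, DELIMITER '{}', NULL '{}')".format(
        table_name, sep, null
    )
    with open(path, mode="r", encoding=encoding) as f:
        # copy_expert streams the file object to the server as COPY input
        cur.copy_expert(sql, f)
    return sql


# Usage, assuming an open psycopg2 connection `con`:
#     cur = con.cursor()
#     copy_csv_file(cur, "data.csv", "my_table")
#     con.commit()
```

If the file is not actually UTF-8, pass the matching `encoding` so that Python decodes it correctly before psycopg2 sends it on.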
