使用to_sql将数据帧中的数据大容量插入Sybase数据库表失败

2024-05-15 12:53:27 发布

您现在位置:Python中文网/ 问答频道 /正文

下面代码的目的是从restful服务获取数据,对其进行规范化,将其存储在带有必要列的dataframe中,然后使用Pandasto_sql将其加载到Sybase表中

错误:

File "C:\Program Files\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 467, in do_executemany cursor.executemany(statement, parameters) sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Sybase][ODBC Driver][Adaptive Server Enterprise]Incorrect syntax near ','.\n (102) (SQLExecDirectW)") [SQL: 'INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") VALUES (?, ?, ?, ?)'] [parameters: (('0050/TAIEX', 'TAIEX', 'TWD', 0), ('035420/KORE', 'KORE', 'KRW', 0), ('0TL/LIF', 'LIF', 'NOK', 1), ('100FTSE/LIF', 'LIF', 'GBP', 0), ('101FTSE/LIF', 'LIF', 'GBP', 0), ('10STAT/OM', 'OM', 'SEK', 0), ('10TB/KFX', 'KFX', 'KRW', 0), ('10TBA/KFX', 'KFX', 'KRW', 0) ... displaying 10 of 4525 total bound parameter sets ... ('ZURF/DTB', 'DTB', 'CHF', 0), ('ZX/NYCE', 'NYCE', 'USD', 0))]

Process finished with exit code 1

代码:

from sqlalchemy.engine.url import *               
from sqlalchemy.connectors.pyodbc import *             
from sqlalchemy import create_engine                       
import urllib.request as request                  
import json                         
import pandas as pd                      
from pandas.io.json import json_normalize, DataFrame      
           
response = request.urlopen('http://tfsdscsw5XX/mdsclass/CONTFUTURES--O.json')            
output=response.read()                              
data=json.loads(output)           
df=json_normalize(data)                           
df1=(df[['CONTRACT_ID','EXCHANGE_ID','CURRENCY','TRADING_CODE']])                
df2=pd.DataFrame(df1)           
print(df2)                
print(df2.CONTRACT_ID)          
            
connector =  PyODBCConnector()                 
url = make_url("sybase+pyodbc://myhost/mydatabase?driver=Adaptive Server Enterprise&port=2306")              
print(connector.create_connect_args(url))                         
engine=create_engine(url)

#it is failing here**
df2.to_sql("contract_test",engine,index=False,if_exists="append",schema="dbo")   

response.close()               

数据帧df2中的数据示例:

      CONTRACT_ID EXCHANGE_ID CURRENCY  TRADING_CODE
0      0050/TAIEX       TAIEX      TWD             0
1     035420/KORE        KORE      KRW             0
2         0TL/LIF         LIF      NOK             1
3     100FTSE/LIF         LIF      GBP             0
4     101FTSE/LIF         LIF      GBP             0

表2.1.1.1合同测试定义:

CREATE TABLE contract_test (
    CONTRACT_ID char(12) NOT NULL,
    EXCHANGE_ID char(12),
    CURRENCY char(4) NOT NULL,
    TRADING_CODE smallint
) 
GO

请帮助了解如何解决此问题?我被困在这里了


Tags: importidjsonurlexchangesqlalchemycodeengine
2条回答

您的问题可能只是Python数据库API的不兼容性。Pandas的to_sql实际上正在运行来自pyodbcexecutemany()调用。此模块在SQL Server中使用更为普遍,尤其是在使用SQLAlchemy的实现中。但是,不完全支持与Sybase的集成。正如SQLAlchemy Sybase docs page中提到的:

Note

The Sybase dialect within SQLAlchemy is not currently supported. It is not tested within continuous integration and is likely to have many issues and caveats not currently handled. Consider using the external dialect instead.

具体而言,executemany似乎正在运行多个VALUES行插入,SQL Server支持这些插入,但Sybase不支持(尽管这两种方言都是TSQL的变体,具有已知的连接历史):

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0), 
       ('035420/KORE', 'KORE', 'KRW', 0), 
       ('0TL/LIF', 'LIF', 'NOK', 1), 
...

相反,Sybase需要具有多个INSERT INTO调用的经典ANSI-SQL:

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0) 
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('035420/KORE', 'KORE', 'KRW', 0)
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0TL/LIF', 'LIF', 'NOK', 1)
...

解决,而不是熊猫的方便^ {< CD1>}方法,考虑一个直接的SqLalCyy ^ {< CD4>},使用参数的数据帧行通过^{}调用。下面假设contract_test表始终预先存在

engine = create_engine(url)
sql = """INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
         VALUES (?, ?, ?, ?)"""

with engine.connect() as connection:
    result = connection.execute(sql, df2.to_numpy().tolist())

如果上述问题仍然存在,请集成for循环:

with engine.connect() as connection:
    for row in df2.to_numpy().tolist():
        result = connection.execute(sql, row)

external SAP ASE (Sybase) dialect现在是Sybase推荐的SQLAlchemy方言,如果使用SAP ASE ODBC驱动程序,它确实支持fast_executemany

相关问题 更多 >