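MSSQL cursor to NumPy array/DataFrame - TypeError: expected a readable buffer object
I am trying to load the results of a cursor into a numpy array. I tried building a record array, but I keep getting this error: TypeError: expected a readable buffer object.
In short, my goal is to populate a numpy array or a pandas DataFrame with the results of a database query while preserving the column data types (which is why I specified a record dtype).
I am using MSSQL 2008 R2 on a Windows 7 machine, with Python 2.7.
Here are my table and data: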
CREATE TABLE dbo.records
(
AID int NOT NULL,
ID int NOT NULL,
EID int NOT NULL,
PCODE char(2) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
PVALUE float(53) NOT NULL,
SI float(53) NOT NULL,
SC float(53) NOT NULL,
EE float(53) NOT NULL,
CONSTRAINT PK_records
PRIMARY KEY CLUSTERED (AID,ID,EID,PCODE)
)
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005901, 'GU', 12.886827900148048, 29.206025325689056, 7.301506331422264, 15956.1633 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005902, 'GU', 4.73038822711228, 10.77588154791872, 2.69397038697968, 15922.6067 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005904, 'GU', 22.645755728195648, 50.992083564652032, 12.748020891163008, 15645.0021 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005906, 'GU', 8.925904127986528, 20.274646568966784, 5.068661642241696, 15263.6771 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005913, 'GU', 56.382898744630784, 124.279775860660096, 31.069943965165024, 15985.144 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005914, 'GU', 12.09781233128632, 27.431428572167264, 6.857857143041816, 15950.0621 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005915, 'GU', 44.406190758700864, 98.59373165158272, 24.64843291289568, 15722.7924 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005917, 'GU', 7.133941786197224, 16.22555112649744, 4.05638778162436, 15696.8623 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005918, 'GU', 24.426820608145728, 54.920898223398144, 13.730224555849536, 15326.2144 )
go
INSERT INTO records ( AID, ID, EID, PCODE, PVALUE, SI, SC, EE )
VALUES ( 1, 1, 2005919, 'GU', 16.629517313694592, 37.591771677943552, 9.397942919485888, 15664.831 )
go
Here is my Python script.
import numpy as np
import pandas as pd
import pandas.io.sql as psql
import pyodbc as pyodbc
from pandas import DataFrame
from pandasql import sqldf
from pandasql import load_meat
from types import *
from StringIO import StringIO
from itertools import chain
driver_name = "{SQL Server Native Client 10.0}" #"{SQL Server}"
server_name = "(local)"
db_name = "test_db"
analysis_qry = """
SELECT p.AID,
p.EID,
p.PCODE,
p.PVALUE,
p.SI,
p.SC,
p.EE
FROM """+db_name+"""..records p
"""
cnxn = pyodbc.connect('Trusted_connection=yes',driver='{SQL Server Native Client 10.0}',server=server_name,database=db_name)
cursor = cnxn.cursor()
#fetch all the rows back into arrays
cursor.execute(analysis_qry)
results = cursor.fetchall()
D = np.fromiter(chain.from_iterable(results), dtype=('i4, i4, a2, f8, f8, f8, f8'), count=-1)
1 Answer
There are two ways to do this, and both give roughly the same result.
Method 1 - convert to a numpy array:
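cursor = cnxn.cursor()
cursor.execute(analysis_qry)
results = cursor.fetchall()
# pyodbc may report rowcount as -1 for SELECT statements, so count the fetched rows instead
num_rows = len(results)
# convert each pyodbc Row to a tuple so that it fills one record of the structured dtype
arr_analysis = np.fromiter((tuple(row) for row in results), dtype='i4,i4,a4,f8,f8,f8,f8', count=num_rows)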
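To try the same idea without a database, here is a minimal sketch that uses two rows copied from the sample data in the question as a stand-in for cursor.fetchall(). The field names and the "U2" string type are my own choices for illustration, not something the original code uses. It also shows what chain.from_iterable does to the rows: it flattens them into individual scalars, so np.fromiter no longer receives one item per record.

```python
from itertools import chain

import numpy as np

# Stand-in for cursor.fetchall(): two rows copied from the sample data.
rows = [
    (1, 2005901, "GU", 12.886827900148048, 29.206025325689056, 7.301506331422264, 15956.1633),
    (1, 2005902, "GU", 4.73038822711228, 10.77588154791872, 2.69397038697968, 15922.6067),
]

# One field per selected column; the names are illustrative assumptions.
record_dtype = np.dtype([
    ("AID", "i4"), ("EID", "i4"), ("PCODE", "U2"),
    ("PVALUE", "f8"), ("SI", "f8"), ("SC", "f8"), ("EE", "f8"),
])

# Each item handed to fromiter is a whole row, so it fills one record.
arr = np.fromiter((tuple(row) for row in rows), dtype=record_dtype, count=len(rows))

# Flattening yields scalars instead of rows, which is not what a record dtype expects.
flat = list(chain.from_iterable(rows))

print(arr["EID"])       # integer column
print(arr.dtype.names)  # field names of the structured array
print(flat[:3])         # first three flattened values
```

Naming the fields makes columns addressable as arr["EID"] rather than the default arr["f1"].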
Method 2 - convert to a pandas DataFrame:
analysis_df = psql.read_frame(analysis_qry, cnxn)  # read_frame was deprecated in pandas 0.14; newer versions use pd.read_sql(analysis_qry, cnxn)
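The DataFrame route can also be tried without a SQL Server instance. The sketch below uses an in-memory SQLite database as a stand-in for the pyodbc connection (an assumption made only so the example runs anywhere), with a cut-down version of the records table and two of the sample rows. It calls pd.read_sql, and pandas infers a dtype for each column from the returned values.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the SQL Server connection (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (AID INTEGER, EID INTEGER, PCODE TEXT, PVALUE REAL)"
)
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        (1, 2005901, "GU", 12.886827900148048),
        (1, 2005902, "GU", 4.73038822711228),
    ],
)

# read_sql runs the query and builds a DataFrame with one dtype per column.
df = pd.read_sql("SELECT AID, EID, PCODE, PVALUE FROM records ORDER BY EID", conn)
print(df.dtypes)
conn.close()
```

With a real pyodbc connection the call has the same shape, pd.read_sql(analysis_qry, cnxn), although recent pandas versions may warn that only SQLAlchemy connectables and sqlite3 connections are officially supported.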