How do I convert a SQLAlchemy table object to a Pandas DataFrame?

10 votes
4 answers
12832 views
Asked 2025-04-18 16:59

Is there a way to convert a table object obtained from SQLAlchemy into a Pandas DataFrame, or do I need to write a dedicated function for this?

4 Answers

-1

I have a simpler approach:

# Step1: import
import pandas as pd
from sqlalchemy import create_engine, inspect

# Step2: create_engine
connection_string = "sqlite:////absolute/path/to/database.db"
engine = create_engine(connection_string)

# Step3: select table
# engine.table_names() was deprecated in SQLAlchemy 1.4 and removed in 2.0; use the inspector instead
print(inspect(engine).get_table_names())

# Step4: read table
table_df = pd.read_sql_table('table_name', engine)
table_df.head()

For other kinds of connection_string, see the SQLAlchemy 1.4 documentation.
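
For anyone who wants to try the steps above without an existing database file, here is a minimal sketch. It assumes an in-memory SQLite database and a hypothetical table called users, neither of which comes from the answer; only inspect(...).get_table_names() and pd.read_sql_table are the calls being illustrated.

```python
import pandas as pd
from sqlalchemy import create_engine, inspect, text

# In-memory SQLite database, used only so the sketch is self-contained.
engine = create_engine("sqlite://")

# Create and fill a hypothetical table named "users".
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
    conn.execute(text("INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob')"))

# List the available tables, then load one of them in full.
table_names = inspect(engine).get_table_names()
users_df = pd.read_sql_table("users", engine)

print(table_names)
print(users_df)
```

read_sql_table always loads the whole table; if you need filtering, a query-based function such as read_sql_query is probably the better fit.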

0

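Pandas database functions such as read_sql_query accept a SQLAlchemy connection object (a so-called SQLAlchemy connectable; see the pandas documentation and the SQLAlchemy documentation for details). Here is an example that uses a connection object named my_connection: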

import pandas as pd
import sqlalchemy

# create SQLAlchemy Engine object instance
# (dialect, driver, login, password, host and db_name are placeholders you must define first)
my_engine = sqlalchemy.create_engine(f"{dialect}+{driver}://{login}:{password}@{host}/{db_name}")

# connect to the database using the newly created Engine instance
my_connection = my_engine.connect()

# run SQL query
my_df = pd.read_sql_query(sql=my_sql_query, con=my_connection)
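
A runnable variant of the same idea, as a sketch only: it assumes an in-memory SQLite database and a hypothetical books table, and it passes the query as a sqlalchemy.text() construct with a bound parameter rather than a plain string. The connection is opened in a with block so it is closed afterwards.

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite database with a hypothetical "books" table.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, language TEXT)"))
    conn.execute(text(
        "INSERT INTO books (id, title, language) "
        "VALUES (1, 'Dune', 'english'), (2, 'Solaris', 'polish')"
    ))

# Parameterized query; :lang is filled in from the params dict.
query = text("SELECT id, title FROM books WHERE language = :lang")

# Pass the open Connection to pandas; it is closed when the block exits.
with engine.connect() as conn:
    books_df = pd.read_sql_query(sql=query, con=conn, params={"lang": "polish"})

print(books_df)
```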

6

I think I have tried this before. It is a bit hacky, but for ORM query results covering a whole table it should work:

import pandas as pd

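# SQLA_Table is the mapped class; result_list holds the ORM objects returned by the query
cols = [c.name for c in SQLA_Table.__table__.columns]
pk = [c.name for c in SQLA_Table.__table__.primary_key]
# build real tuples; a bare generator expression here would not give from_records usable rows
tuplefied_list = [tuple(getattr(item, col) for col in cols) for item in result_list]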

df = pd.DataFrame.from_records(tuplefied_list, index=pk, columns=cols)

This also works for partial query results (named tuples), but you need to build the DataFrame's columns and index according to your query so that they match.
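
To make the partial-query case concrete, here is a small sketch. The Book class, its columns, and the in-memory SQLite engine are all hypothetical and only there to make it runnable; the point is that cols lists exactly the columns selected in the query, in the same order.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Book(Base):  # hypothetical mapped class
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    language = Column(String)

engine = create_engine("sqlite://")  # in-memory database for the sketch
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Book(id=1, title="Dune", language="english"),
        Book(id=2, title="Solaris", language="polish"),
    ])
    session.commit()
    # Partial query: two of the three columns, returned as named-tuple-like rows.
    rows = session.query(Book.id, Book.title).order_by(Book.id).all()

# Column names must match the selected columns, in the same order.
cols = ["id", "title"]
partial_df = pd.DataFrame.from_records([tuple(r) for r in rows], columns=cols, index="id")

print(partial_df)
```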

16

This may not be the most efficient way, but it works for me: I use automap_base to map the database tables and then convert the result to a Pandas DataFrame.

import pandas as pd
from sqlalchemy.ext.automap import automap_base
from sqlalchemy import create_engine
from sqlalchemy.orm import Session

connection_string = "your:db:connection:string:here"
engine = create_engine(connection_string, echo=False)
session = Session(engine)

# sqlalchemy: Reflect the tables
Base = automap_base()
# prepare(engine, reflect=True) was deprecated in SQLAlchemy 1.4 and removed in 2.0
Base.prepare(autoload_with=engine)

# Mapped classes are now created with names by default matching that of the table name.
Table_Name = Base.classes.table_name

# Example query with filtering
query = session.query(Table_Name).filter(Table_Name.language != 'english')

# Convert to DataFrame
df = pd.read_sql(query.statement, engine)
df.head()
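
If what you have is a Core Table object rather than an automapped class, something along these lines should also work. This is a sketch under assumptions: the in-memory SQLite engine and the books table are made up for illustration, and the table is reflected with Table(..., autoload_with=engine) before being passed to pandas through select().

```python
import pandas as pd
from sqlalchemy import MetaData, Table, create_engine, select, text

# In-memory SQLite database with a hypothetical "books" table.
engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, language TEXT)"))
    conn.execute(text(
        "INSERT INTO books (id, title, language) "
        "VALUES (1, 'Dune', 'english'), (2, 'Solaris', 'polish')"
    ))

# Reflect the existing table into a SQLAlchemy Table object.
metadata = MetaData()
books = Table("books", metadata, autoload_with=engine)

# Build a SELECT from the Table object and hand it to pandas.
stmt = select(books).where(books.c.language != "english")
with engine.connect() as conn:
    table_df = pd.read_sql(stmt, conn)

print(table_df)
```

Dropping the where() clause gives you the whole table, which is roughly what read_sql_table does by name.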
