SQLalchemy: 列值组合的所有排列的分位数
我们有一个 SQL Server 查询,需要为越来越多的变量生成 ntiles(分位数),这些变量会以不同的组合方式互相结合。下面是一个例子,来说明我想表达的意思:
语句 1:
ntile(10) over (partition by MAUorALL, User_Type, fsi.Month_ID
order by Objects_Created) AS Ntile_Mon_Objects_Created,
语句 2:
ntile(10) over (partition by MAUorALL, User_Type, fsi.Month_ID, *Country*
order by Objects_Created) AS Ntile_Country_Objects_Created
语句 3:
ntile(10) over (partition by MAUorALL, User_Type, fsi.Month_ID, *User*_Type
order by Objects_Created) AS Ntile_UT_Objects_Created
你可以看到,这些语句基本上是一样的,除了第二和第三个语句中,斜体的“country”(国家)和“user type”(用户类型)这两列是新加的。所以我们对同一个变量“Objects_Created”(创建的对象)在不同的细节层次上计算 ntiles,同时也需要对这些变量的各种可能组合进行 ntiles 的计算,比如:
语句 4:
ntile(10) over (partition by MAUorALL, User_Type, fsi.Month_ID, *Country, User_Type*
order by Objects_Created) AS Ntile_Country_UT_Objects_Created
我们可以手动编写这些组合的代码,但如果能用 sqlalchemy 来执行这些变量的所有组合,可能会让事情变得简单一些。有没有人能提供一个我可以改用的例子?
谢谢你的帮助!
1 个回答
0
我不知道 fsi
和其他列有什么关系,但假设所有数据都在一个模型里(用 sqlalchemy
查询很容易扩展),就像下面这样:
class User(Base):
__tablename__ = 't_users'
id = Column(Integer, primary_key=True)
MAUorALL = Column(String)
User_Type = Column(String)
Country = Column(String)
Month_ID = Column(Integer)
Objects_Created = Column(Integer)
这个任务可以通过简单使用 itertools.permutations
(或者 itertools.combinations
,具体取决于你想要实现什么)来完成查询。下面的代码会为 User
表生成一个包含各种 ntiles
的查询。我想看代码就能理解发生了什么:
# configuration: {label: Column}
column_labels = {
'Country': User.Country,
'UT': User.User_Type,
}
def get_ntile(additional_columns=None):
""" @return: sqlalchemy expression for selecting a given ntile() using
predefined as well as *additional* columns.
"""
partition_by = [
User.MAUorALL,
User.User_Type,
User.Month_ID,
]
label = "Ntile_Objects_Created"
if additional_columns:
lbls = []
for col_name in additional_columns:
col = column_labels[col_name]
partition_by.append(col)
lbls.append(col_name)
label = "Ntile_{}_Objects_Created".format("_".join(lbls))
xprs = over(
func.ntile(10),
partition_by = partition_by,
order_by = User.Objects_Created,
).label(label)
return xprs
def get_query(additional_columns=['UT', 'Country']):
""" @return: a query object which selects a User with additional ntiles
for predefined columns (fixed) and all possible permutations of
*additional_columns*
"""
from itertools import permutations#, combinations
tiles = [get_ntile(comb)
for r in range(len(additional_columns) + 1)
for comb in permutations(additional_columns, r)
]
q = session.query(User, *tiles)
return q
q = get_query()
print [_c["name"] for _c in q.column_descriptions]
# >>> ['User', 'Ntile_Objects_Created', 'Ntile_UT_Objects_Created', 'Ntile_Country_Objects_Created', 'Ntile_UT_Country_Objects_Created', 'Ntile_Country_UT_Objects_Created']
for tile in q.all():
print tile