如何在SQLAlchemy中使用子查询计算移动平均?
我的问题是,我想获取一系列测量值以及这些测量值的移动平均值。我可以用这个SQL语句来做到这一点(这是PostgreSQL的时间间隔语法):
SELECT time, value,
(
SELECT AVG(t2.value)
FROM measurements t2
WHERE t2.time BETWEEN t1.time - interval '5 days' AND t1.time
) moving_average
FROM measurements t1
ORDER BY t1.time;
我想要用SQLAlchemy写出类似的代码来实现这个效果。目前我有这段Python代码:
moving_average_days = # configureable value, defaulting to 5
t1 = Measurements.alias('t1')
t2 = Measurements.alias('t2')
query = select([t1.c.time, t1.c.value, select([func.avg(t2.c.value)], t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time))],
t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
order_by(Measurements.c.time)
但是,这段代码生成了这样的SQL:
SELECT t1.time, t1.value, avg_1
FROM measurements AS t1,
(
SELECT avg(t2.value) AS avg_1
FROM measurements AS t2
WHERE t2.time BETWEEN t1.time - %(time_1)s AND t1.time
)
WHERE t1.time > %(time_2)s
ORDER BY t1.time;
这个SQL在FROM部分有一个子查询,但它无法直接访问顶层值的列值,这就导致PostgreSQL报了这个错:
ERROR: subquery in FROM cannot refer to other relations of same query level
LINE 6: WHERE t2.time BETWEEN t1.time - interval '5 days' AN...
所以我想知道的是:我该如何让SQLAlchemy把子查询移动到SELECT部分呢?
另外,还有一种获取移动平均值的方法(不需要为每一个(时间,值)对执行查询),这也是一个选项。
1 个回答
5
好的,显然我需要的是一种叫做 标量选择 的东西。使用这个后,我得到了这段 Python 代码,实际上它能按照我想要的方式工作(生成的 SQL 和我问题中的第一个 SQL 是等价的,这正是我的目标):
moving_average_days = # configurable value, defaulting to 5
ndays = # configurable value, defaulting to 90
t1 = Measurements.alias('t1') ######
t2 = Measurements.alias('t2')
query = select([t1.c.time, t1.c.value,
select([func.avg(t2.c.value)],
t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time)).label('moving_average')],
t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
order_by(t1.c.time)
这段代码生成了以下 SQL:
SELECT t1.time, t1.value,
(
SELECT avg(t2.value) AS avg_1
FROM measurements AS t2
WHERE t2.time BETWEEN t1.time - :time_1 AND t1.time
) AS moving_average
FROM measurements AS t1
WHERE t1.time > :time_2 ORDER BY t1.time;