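Average time difference in a list
I want to compute the average time difference between consecutive dates in a list. The approach below works fine, but I was wondering whether there is a smarter way?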
delta = lambda last, next: (next - last).seconds + (next - last).days * 86400
total = sum(delta(items[i-1], items[i]) for i in range(1, len(items)))
average = total / (len(items) - 1)
4 Answers
3
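Try this:
from itertools import izip  # Python 2; on Python 3 the built-in zip is already lazy
def average(items):
    # Sum every gap between neighbouring items in seconds, then divide by the number of gaps.
    total = sum((next - last).seconds + (next - last).days * 86400
                for next, last in izip(items[1:], items))
    return total / (len(items) - 1)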
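I think this is much easier to read. A comment could help less mathematically inclined readers work out how each delta is computed. For what it's worth, of everything I looked at, using a single generator expression produces the fewest opcode instructions (and, I suspect, is also the least slow).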
# The way in your question compiles to....
  3           0 LOAD_CONST               1 (<code object <lambda> at 0xb7760ec0, file "scratch.py", line 3>)
              3 MAKE_FUNCTION            0
              6 STORE_DEREF              1 (delta)

  4           9 LOAD_GLOBAL              0 (sum)
             12 LOAD_CLOSURE             0 (items)
             15 LOAD_CLOSURE             1 (delta)
             18 BUILD_TUPLE              2
             21 LOAD_CONST               2 (<code object <genexpr> at 0xb77c0a40, file "scratch.py", line 4>)
             24 MAKE_CLOSURE             0
             27 LOAD_GLOBAL              1 (range)
             30 LOAD_CONST               3 (1)
             33 LOAD_GLOBAL              2 (len)
             36 LOAD_DEREF               0 (items)
             39 CALL_FUNCTION            1
             42 CALL_FUNCTION            2
             45 GET_ITER
             46 CALL_FUNCTION            1
             49 CALL_FUNCTION            1
             52 STORE_FAST               1 (total)

  5          55 LOAD_FAST                1 (total)
             58 LOAD_GLOBAL              2 (len)
             61 LOAD_DEREF               0 (items)
             64 CALL_FUNCTION            1
             67 LOAD_CONST               3 (1)
             70 BINARY_SUBTRACT
             71 BINARY_DIVIDE
             72 STORE_FAST               2 (average)
             75 LOAD_CONST               0 (None)
             78 RETURN_VALUE
None
#
# doing it with just one generator expression and itertools...
  4           0 LOAD_GLOBAL              0 (sum)
              3 LOAD_CONST               1 (<code object <genexpr> at 0xb777eec0, file "scratch.py", line 4>)
              6 MAKE_FUNCTION            0

  5           9 LOAD_GLOBAL              1 (izip)
             12 LOAD_FAST                0 (items)
             15 LOAD_CONST               2 (1)
             18 SLICE+1
             19 LOAD_FAST                0 (items)
             22 CALL_FUNCTION            2
             25 GET_ITER
             26 CALL_FUNCTION            1
             29 CALL_FUNCTION            1
             32 STORE_FAST               1 (total)

  6          35 LOAD_FAST                1 (total)
             38 LOAD_GLOBAL              2 (len)
             41 LOAD_FAST                0 (items)
             44 CALL_FUNCTION            1
             47 LOAD_CONST               2 (1)
             50 BINARY_SUBTRACT
             51 BINARY_DIVIDE
             52 RETURN_VALUE
None
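In particular, dropping the lambda function lets us avoid making a closure, building a tuple, and loading two closures. Five functions get called either way. Of course, worrying about performance at this level is overkill, but it's nice to know what is going on under the hood. Readability matters most, and I think this version scores high on that as well.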
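For anyone on Python 3, here is a rough sketch of the same single-generator idea. It assumes `items` is a list of `datetime` objects; the function name `average_gap_seconds` and the sample list `stamps` are made up for illustration. It swaps `izip` for the built-in `zip` and uses `timedelta.total_seconds()` in place of the manual `seconds + days * 86400` arithmetic, so it also keeps the microseconds.

```python
from datetime import datetime


def average_gap_seconds(items):
    """Mean gap, in seconds, between consecutive datetimes in `items`."""
    if len(items) < 2:
        raise ValueError("need at least two datetimes")
    # Pair each element with its successor: (items[0], items[1]), (items[1], items[2]), ...
    total = sum((later - earlier).total_seconds()
                for earlier, later in zip(items, items[1:]))
    return total / (len(items) - 1)


# Hypothetical sample data: gaps of 10 s and 30 s.
stamps = [datetime(2024, 1, 1, 12, 0, 0),
          datetime(2024, 1, 1, 12, 0, 10),
          datetime(2024, 1, 1, 12, 0, 40)]
print(average_gap_seconds(stamps))  # 20.0
```

The result is a float number of seconds; wrap it in `datetime.timedelta(seconds=...)` if you would rather have a timedelta back.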
8
If you have a list of timedeltas:
import pandas as pd
avg=pd.to_timedelta(pd.Series(yourtimedeltalist)).mean()
77
By the way, if you already have a bunch of timedeltas or datetimes, why do the arithmetic yourself at all?
import datetime
datetimes = [ ... ]
# subtracting datetimes gives timedeltas
timedeltas = [datetimes[i] - datetimes[i-1] for i in range(1, len(datetimes))]
# giving datetime.timedelta(0) as the start value makes sum work on tds
average_timedelta = sum(timedeltas, datetime.timedelta(0)) / len(timedeltas)
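One observation that may count as the "smarter way" the question asks for: the sum of consecutive signed differences telescopes, so every intermediate date cancels and only the last minus the first remains. A minimal sketch, assuming Python 3 and a list of `datetime` objects (the name `average_timedelta_between` and the sample data are invented here):

```python
from datetime import datetime, timedelta


def average_timedelta_between(datetimes):
    """Average signed gap between consecutive datetimes, as a timedelta."""
    if len(datetimes) < 2:
        raise ValueError("need at least two datetimes")
    # (d1 - d0) + (d2 - d1) + ... + (dn - d(n-1)) collapses to dn - d0.
    return (datetimes[-1] - datetimes[0]) / (len(datetimes) - 1)


# Hypothetical sample data: gaps of 10 s and 30 s.
stamps = [datetime(2024, 1, 1, 12, 0, 0),
          datetime(2024, 1, 1, 12, 0, 10),
          datetime(2024, 1, 1, 12, 0, 40)]
print(average_timedelta_between(stamps))  # 0:00:20
```

This only matches the loop-based versions because they sum signed differences; if you wanted the average of absolute gaps in an unsorted list, you would still need to look at each pair.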