Python generator expressions for two datasets

Posted 2024-04-26 21:54:12


from itertools import groupby

# Find values that are in range
in_range = [lo_lim <= v <= hi_lim for v in values]
# Find runs of in-range values (count the length of each True group)
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

# Estimate total time spent in-range
total_time = sum(v if v > 1 else (Buffer_Value*sample_rate) for v in runs)
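
To make the run-counting step concrete, here is a minimal sketch of the same code on a small made-up sample. The data and the values of lo_lim, hi_lim, Buffer_Value and sample_rate are assumptions chosen only for illustration.

```python
from itertools import groupby

# Hypothetical sample data and limits, chosen only for illustration
values = [1, 5, 6, 9, 4, 12, 5, 5, 5]
lo_lim, hi_lim = 4, 8
Buffer_Value, sample_rate = 0.5, 1.0

# True where the value lies within [lo_lim, hi_lim]
in_range = [lo_lim <= v <= hi_lim for v in values]

# groupby yields one group per stretch of equal consecutive flags;
# keep only the True stretches and count how long each one is
runs = [sum(1 for _ in group) for flag, group in groupby(in_range) if flag]

# A run of a single sample contributes Buffer_Value*sample_rate,
# a longer run contributes its length
total_time = sum(r if r > 1 else Buffer_Value * sample_rate for r in runs)

print(runs)        # [2, 1, 3]
print(total_time)  # 5.5
```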

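I am trying to extend this code to take two sets of values and two pairs of hi/lo limits, so that it computes the total time spent within the combined limits, where a point counts as "within limits" only when both datasets are within their limits at that same point, i.e.

if there are 100 data points (both datasets have the same length), check each point: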

if values_1[45] and values_2[45] are in their respective limits

then the point counts as in range. Essentially, I want to turn this if into a generator expression:

if lo_lim_1<=Data_Points_1[i]<=hi_lim_1 and lo_lim_2<=Data_Points_2[i]<=hi_lim_2:

Then count the runs: if a run is one data point long, apply the buffer value, otherwise apply the sample-rate conversion.
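
For reference, the behaviour described above could be written as an explicit loop like the one below. This is only one reading of the question: the sample data and limit values are made up, and runs longer than one point are counted by their length, as in the original total_time line.

```python
# Hypothetical sample data and limits, chosen only for illustration
Data_Points_1 = [1, 5, 6, 7, 2, 5, 5, 9, 6]
Data_Points_2 = [10, 11, 12, 30, 12, 13, 13, 15, 15]
lo_lim_1, hi_lim_1 = 4, 8
lo_lim_2, hi_lim_2 = 10, 20
Buffer_Value, sample_rate = 0.5, 1.0

runs = []
current = 0  # length of the in-range run currently being counted
for i in range(len(Data_Points_1)):
    if lo_lim_1 <= Data_Points_1[i] <= hi_lim_1 and lo_lim_2 <= Data_Points_2[i] <= hi_lim_2:
        current += 1
    elif current:
        # the run just ended, so record it and start over
        runs.append(current)
        current = 0
if current:
    # a run that reaches the end of the data
    runs.append(current)

total_time = sum(r if r > 1 else Buffer_Value * sample_rate for r in runs)

print(runs)        # [2, 2, 1]
print(total_time)  # 4.5
```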


1 answer
User
#1 · posted 2024-04-26 21:54:12

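If I understand your question correctly, this should work. The basic idea is that zip combines the two sequences into pairs of corresponding values, and an and operation then finds the cases where both values are within their respective ranges: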

from itertools import groupby

# Find points where both values are in their respective ranges
in_range = [lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2 for v1, v2 in zip(values1, values2)]

# code is unchanged from here
# Find runs of in-range values
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]  # this is the same as yours

# Estimate total time spent in-range
total_time = sum(v if v > 1 else (Buffer_Value*sample_rate) for v in runs)
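
As a quick check of the zip-based approach, here is a small self-contained sketch. The two datasets, the limits and the values of Buffer_Value and sample_rate are made up for illustration.

```python
from itertools import groupby

# Hypothetical data: two equal-length series, each with its own limits
values1 = [1, 5, 6, 7, 2, 5, 5, 9, 6]
values2 = [10, 11, 12, 30, 12, 13, 13, 15, 15]
lo_lim1, hi_lim1 = 4, 8
lo_lim2, hi_lim2 = 10, 20
Buffer_Value, sample_rate = 0.5, 1.0

# zip pairs up the i-th value of each series; a point counts as in range
# only when both values are inside their own limits
in_range = [lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2
            for v1, v2 in zip(values1, values2)]

# Length of each consecutive stretch of in-range points
runs = [sum(1 for _ in group) for v, group in groupby(in_range) if v]

total_time = sum(v if v > 1 else (Buffer_Value * sample_rate) for v in runs)

print(runs)        # [2, 2, 1]
print(total_time)  # 4.5
```

Note that zip stops at the end of the shorter input without raising an error, so a length mismatch between the two datasets would go unnoticed. On Python 3.10 and later, zip(values1, values2, strict=True) raises ValueError in that case.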

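In this case, if you are using Python 2.x, you can use itertools.izip instead of zip to save some memory (in Python 3.x, zip already returns a lazy iterator). On both Python 2.x and 3.x, you can use generator expressions instead of lists to save even more memory: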

# Find points where both values are in range (lazy, nothing is computed yet)
in_range = (lo_lim1 <= v1 <= hi_lim1 and lo_lim2 <= v2 <= hi_lim2 for v1, v2 in zip(values1, values2))

# Find runs of in-range values
runs = (sum(1 for _ in group) for v, group in groupby(in_range) if v)  # same logic as yours, but lazy

# Estimate total time spent in-range
defval = Buffer_Value*sample_rate  # computed once rather than once per run
total_time = sum(v if v > 1 else defval for v in runs)
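
One way to package the generator version is as a small function, sketched below. The function name time_in_range and its parameters are hypothetical, and the sample data is made up. The last few lines show the main caveat of generator expressions: they can be consumed only once.

```python
from itertools import groupby

def time_in_range(values1, values2, limits1, limits2, single_point_time):
    """Hypothetical helper: total time during which both series are in range.

    limits1 and limits2 are (lo, hi) pairs; single_point_time is the value
    used for a run of exactly one sample (Buffer_Value*sample_rate above).
    """
    lo1, hi1 = limits1
    lo2, hi2 = limits2
    # Lazy stream of True/False flags, one per pair of points
    flags = (lo1 <= v1 <= hi1 and lo2 <= v2 <= hi2
             for v1, v2 in zip(values1, values2))
    # Lazy stream of run lengths for the True stretches
    runs = (sum(1 for _ in group) for flag, group in groupby(flags) if flag)
    return sum(r if r > 1 else single_point_time for r in runs)

# Made-up sample data for illustration
values1 = [1, 5, 6, 7, 2, 5, 5, 9, 6]
values2 = [10, 11, 12, 30, 12, 13, 13, 15, 15]
total = time_in_range(values1, values2, (4, 8), (10, 20), 0.5 * 1.0)
print(total)  # 4.5

# Caveat: a generator is exhausted after one pass
flags = (x > 0 for x in [1, -1, 2])
first = sum(flags)   # 2, this consumes the generator
second = sum(flags)  # 0, nothing is left to iterate over
```

If in_range or runs is needed more than once, for example for debugging or for a second statistic, the list version is the safer choice.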
