How can I concisely add two Series, but only add the positive values?

Posted 2024-04-25 23:37:15


I have two Series:

energy_dict['QLD'] = 

Timestamp
2017-04-27 00:00:00    523.720765
2017-04-27 01:00:00    512.180608
2017-04-27 02:00:00    519.076642
2017-04-27 03:00:00    516.329201
2017-04-27 04:00:00    525.150158
   ...                 ...
Freq: H, Name: QLD Total Energy (MWh), Length: 8760, dtype: float64

and

Incoming_Flow = 

Timestamp
2017-04-27 00:00:00    -8.961111
2017-04-27 01:00:00     9.503472
2017-04-27 02:00:00   -10.776389
2017-04-27 03:00:00     1.451389
2017-04-27 04:00:00   -10.388195
        ...               ...

Freq: H, Name: METEREDMWFLOW N-Q-MNSP1, Length: 8760, dtype: float64

I want to add them together, but only where the value in the second Series is greater than zero. What is the best way to do this?

I know I can do this:

Incoming_Flow[Incoming_Flow < 0 ] = 0

But I would like to do it all in one line.


3 answers

It is faster to use numpy's add and where:

import numpy as np
import pandas as pd

qld = [523.720765, 512.180608, 519.076642, 516.329201, 525.150158]
flow = [ -8.961111,   9.503472, -10.776389,   1.451389, -10.388195]

df1 = pd.DataFrame(qld, columns=['QLD'])
df2 = pd.DataFrame(flow, columns=['Incoming_Flow'])

s = np.add(df1['QLD'], np.where(df2['Incoming_Flow'] > 0, df2['Incoming_Flow'], 0))

print(s)

0    523.720765
1    521.684080
2    519.076642
3    517.780590
4    525.150158

Timings:

s1 = pd.Series(np.arange(50000))
s2 = pd.Series(np.random.randint(-4, 10,50000))

%timeit s1.add(s2.where(s2.gt(0), 0))
890 µs ± 58.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit np.add(s1, np.where(s2 > 0, s2, 0))
367 µs ± 6.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
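The %timeit lines above are IPython magics and will not run in a plain Python script. As a rough sketch, the same comparison can be reproduced with the standard timeit module. The function names and the fixed random seed here are illustrative assumptions, and absolute numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data with the same shape as in the timings above.
rng = np.random.default_rng(0)  # fixed seed: an assumption, only for repeatability
s1 = pd.Series(np.arange(50000))
s2 = pd.Series(rng.integers(-4, 10, 50000))


def pandas_version():
    # Series.where keeps positive values and replaces the rest with 0.
    return s1.add(s2.where(s2.gt(0), 0))


def numpy_version():
    # np.where builds a plain array; np.add returns a Series again.
    return np.add(s1, np.where(s2 > 0, s2, 0))


# Both approaches should produce the same values.
assert np.array_equal(pandas_version().to_numpy(), numpy_version().to_numpy())

for fn in (pandas_version, numpy_version):
    seconds = timeit.timeit(fn, number=200) / 200
    print(f"{fn.__name__}: {seconds * 1e6:.0f} µs per call")
```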

You can also use Series.add with Series.where:

s = energy_dict['QLD'].add(Incoming_Flow.where(Incoming_Flow.gt(0), 0))
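As a small self-contained check of the add/where one-liner, here is a sketch that uses the five hourly values shown in the question. The names qld, flow and total are illustrative stand-ins for energy_dict['QLD'] and Incoming_Flow, not names from the original post.

```python
import pandas as pd

# Five hourly timestamps starting at 2017-04-27 00:00, as in the question.
idx = pd.date_range("2017-04-27", periods=5, freq=pd.Timedelta(hours=1))
qld = pd.Series([523.720765, 512.180608, 519.076642, 516.329201, 525.150158], index=idx)
flow = pd.Series([-8.961111, 9.503472, -10.776389, 1.451389, -10.388195], index=idx)

# Keep flow where it is positive, use 0 elsewhere, then add it to qld.
total = qld.add(flow.where(flow.gt(0), 0))

print(total)
```

Unlike the in-place assignment in the question, this leaves flow itself unchanged, because where returns a new Series.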

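If performance matters, the where version is also somewhat faster than the mask solution. In the timings below, the mask version takes about 18% longer (1.17 ms vs. 988 µs):

[Proof]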

s1 = pd.Series(np.arange(50000))
s2 = pd.Series(np.random.randint(-4, 10,50000))

%timeit s1.add(s2.mask(s2 < 0, 0), fill_value=0)
1.17 ms ± 25.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit s1.add(s2[s2 > 0], fill_value=0)
4.68 ms ± 289 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit s1.add(s2.where(s2.gt(0), 0))
988 µs ± 50.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Use Series.add together with Series.mask:

s = energy_dict['QLD'].add(Incoming_Flow.mask(Incoming_Flow < 0, 0), fill_value=0)
print (s)
0    523.720765
1    521.684080
2    519.076642
3    517.780590
4    525.150158
dtype: float64

print (Incoming_Flow.mask(Incoming_Flow < 0, 0))
0    0.000000
1    9.503472
2    0.000000
3    1.451389
4    0.000000
Name: METEREDMWFLOW N-Q-MNSP1, dtype: float64

Or filter the Series and use the parameter fill_value=0:

fill_value : None or float value, default None (NaN)

Fill existing missing (NaN) values, and any new element needed for successful Series alignment, with this value before computation. If data in both corresponding Series locations is missing the result will be missing

s = energy_dict['QLD'].add(Incoming_Flow[Incoming_Flow > 0], fill_value=0)
print (s)
0    523.720765
1    521.684080
2    519.076642
3    517.780590
4    525.150158
dtype: float64

Details:

print (Incoming_Flow[Incoming_Flow > 0])
1    9.503472
3    1.451389
Name: METEREDMWFLOW N-Q-MNSP1, dtype: float64
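To illustrate why fill_value=0 matters when the second Series has been filtered, here is a minimal sketch with the five sample values. The variable names are illustrative. The filtered Series keeps only index labels 1 and 3, so the other labels have no partner during alignment.

```python
import pandas as pd

qld = pd.Series([523.720765, 512.180608, 519.076642, 516.329201, 525.150158])
flow = pd.Series([-8.961111, 9.503472, -10.776389, 1.451389, -10.388195])

positive = flow[flow > 0]  # only index labels 1 and 3 remain

# Without fill_value, labels missing from `positive` produce NaN.
without_fill = qld.add(positive)

# With fill_value=0, a missing partner is treated as 0 before adding.
with_fill = qld.add(positive, fill_value=0)

print(without_fill.isna().tolist())
print(with_fill.isna().any())
```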

Edit:

If performance matters, use numpy.where:

s = pd.Series(np.where(Incoming_Flow < 0, 0, Incoming_Flow ), index=Incoming_Flow.index)
# if the DatetimeIndex values are the same in both Series, a plain array works too
s = np.where(Incoming_Flow < 0, 0, Incoming_Flow )
energy_dict['QLD'].add(s, fill_value=0)
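The two variants above differ in how the result is matched to the first Series. The following sketch, with illustrative names and the five sample values, shows that np.where returns a plain array without an index, and that adding it positionally only gives the right answer when both Series share the same index in the same order.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2017-04-27", periods=5, freq=pd.Timedelta(hours=1))
qld = pd.Series([523.720765, 512.180608, 519.076642, 516.329201, 525.150158], index=idx)
flow = pd.Series([-8.961111, 9.503472, -10.776389, 1.451389, -10.388195], index=idx)

# np.where returns a plain ndarray, so the timestamps are lost here.
clipped = np.where(flow < 0, 0, flow)
print(type(clipped))

# Option 1: restore the index so pandas aligns on timestamps.
as_series = pd.Series(clipped, index=flow.index)
total_aligned = qld.add(as_series, fill_value=0)

# Option 2: add the raw array by position. This assumes both Series
# have identical indexes in identical order.
total_positional = qld + clipped

print(total_aligned)
```

Here both options give the same result because the two indexes are identical. If they were not, only the first option would align the values by timestamp.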
