Subtract a value from every DataFrame entry

Posted 2024-06-16 11:17:08

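I have a large DataFrame containing temperatures in Kelvin. I want to convert all of the temperature data to degrees Celsius. I couldn't find any example that does the subtraction in a single step.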

This is my DataFrame:

                    Antwerp     Busan       Colombo     Dalian      Guangzhou   Hamburg     Hong Kong   Jebel Ali/Dubai    Kaohsiung   Laem Chabang    ... Rotterdam   Shanghai    Shenzhen    Singapore   Tanjung Pelepas Tanjung Priok/Jakarta   Tianjin Xiamen  Yingkou 
time                                                                                    
1990-01-01 00:00:00 273.70395   279.31912   298.03195   268.42200   285.93228   271.31534   290.31357   289.83023   292.94135   298.48724   ... 274.18726   279.60450   288.37366   298.10950   298.23816   299.37143   272.06094   285.92570   265.19046   
1990-01-01 01:00:00 273.72702   279.94266   298.02042   268.18445   286.04940   271.18503   290.59730   289.69333   292.95950   298.01053   ... 274.12128   280.13235   288.59967   298.21176   298.40808   299.59576   272.04776   286.36612   265.10303   
1990-01-01 02:00:00 273.47134   280.65198   298.40310   269.00925   286.67624   271.22790   291.18784   289.33700   293.10632   301.11172   ... 273.94310   282.45330   289.25455   298.39322   298.64725   300.08075   272.84616   287.74683   265.73150 

I just want to subtract 273 from every city column, leaving the time column out.


2 Answers

Judging by the sample data, time appears to be a DatetimeIndex, so simply subtract the scalar value:

df = df.sub(273.15)

If time is a column:

df = df.set_index('time').sub(273.15)

Or, if the first column is the time column:

df.iloc[:, 1:] = df.iloc[:, 1:].sub(273.15)
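The variants above can be wrapped in one small sketch that works whether time is the index or an ordinary column. The helper name `kelvin_to_celsius` and the two-city frame are assumptions made for illustration (the values are copied from the question); the helper selects numeric columns with `select_dtypes`, so a datetime column is left alone wherever it sits.

```python
import pandas as pd

KELVIN_OFFSET = 273.15  # 0 °C expressed in Kelvin


def kelvin_to_celsius(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with every numeric column converted from K to °C."""
    out = df.copy()
    # datetime columns are not "number" dtypes, so they are skipped here
    numeric_cols = out.select_dtypes(include="number").columns
    out[numeric_cols] = out[numeric_cols].sub(KELVIN_OFFSET)
    return out


df = pd.DataFrame({
    "time": pd.to_datetime(["1990-01-01 00:00:00", "1990-01-01 01:00:00"]),
    "Antwerp": [273.70395, 273.72702],
    "Busan": [279.31912, 279.94266],
})

# Case 1: time is an ordinary column
celsius_col = kelvin_to_celsius(df)

# Case 2: time is the index
celsius_idx = kelvin_to_celsius(df.set_index("time"))

print(celsius_col)
print(celsius_idx)
```

Because the helper returns a copy, the original Kelvin frame stays available for comparison.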

Performance with 300k rows:

df = pd.concat([df] * 100000)
print (df)

In [170]: %timeit df.set_index('time').applymap(lambda value:value-273)
1.9 s ± 16.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [171]: %timeit df.set_index('time').sub(273.15)
95.6 ms ± 575 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
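Outside IPython, where `%timeit` is not available, a similar comparison could be sketched with the standard-library `timeit` module. The frame size, the constant column values, and the repetition count below are arbitrary assumptions, and the elementwise path uses a column-by-column `Series.map` as a stand-in for `applymap`, so the numbers will not match the timings quoted above.

```python
import timeit

import pandas as pd

# Hypothetical test frame: hourly timestamps as the index, two city columns
n_rows = 3000
index = pd.date_range("1990-01-01", periods=n_rows, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"Antwerp": 273.7, "Busan": 279.3}, index=index)
df.index.name = "time"


def vectorized():
    # One vectorized subtraction over the whole frame
    return df.sub(273.15)


def elementwise():
    # Python-level call per value (stand-in for DataFrame.applymap)
    return df.apply(lambda col: col.map(lambda value: value - 273.15))


t_vec = timeit.timeit(vectorized, number=5)
t_elem = timeit.timeit(elementwise, number=5)
print(f"vectorized:  {t_vec:.4f} s for 5 runs")
print(f"elementwise: {t_elem:.4f} s for 5 runs")
```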

Sample data:

df = pd.DataFrame({'time': [pd.Timestamp('1990-01-01 00:00:00'), pd.Timestamp('1990-01-01 01:00:00'), pd.Timestamp('1990-01-01 02:00:00')], 'Antwerp': [273.70395, 273.72702000000004, 273.47134], 'Busan': [279.31912, 279.94266, 280.65198], 'Colombo': [298.03195, 298.02042, 298.4031], 'Dalian': [268.422, 268.18445, 269.00925]})
print (df)
                 time    Antwerp      Busan    Colombo     Dalian
0 1990-01-01 00:00:00  273.70395  279.31912  298.03195  268.42200
1 1990-01-01 01:00:00  273.72702  279.94266  298.02042  268.18445
2 1990-01-01 02:00:00  273.47134  280.65198  298.40310  269.00925

If "time" is not the index:

df = df.set_index('time').applymap(lambda value:value-273).reset_index()

Otherwise:

df = df.applymap(lambda value: value-273)

applymap() applies an arbitrary function to every value of the DataFrame (the index is not touched).
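One caveat worth knowing: pandas 2.1 and later deprecate `DataFrame.applymap` in favor of `DataFrame.map`. The sketch below picks whichever method the installed version offers and checks that the elementwise result agrees with plain vectorized subtraction. The helper name `elementwise_sub` is hypothetical, and the two-column frame reuses values from the question.

```python
import pandas as pd


def elementwise_sub(df: pd.DataFrame, offset: float) -> pd.DataFrame:
    """Subtract offset from every value using the elementwise API."""
    # DataFrame.map exists from pandas 2.1; older versions only have applymap
    method = df.map if hasattr(df, "map") else df.applymap
    return method(lambda value: value - offset)


df = pd.DataFrame({
    "time": pd.to_datetime(["1990-01-01 00:00:00", "1990-01-01 01:00:00"]),
    "Antwerp": [273.70395, 273.72702],
    "Dalian": [268.42200, 268.18445],
})

# Move time into the index so the function only sees temperatures,
# then restore it as a column afterwards
result = elementwise_sub(df.set_index("time"), 273).reset_index()

# The vectorized form gives the same numbers without a per-value Python call
expected = df.set_index("time").sub(273).reset_index()
print(result)
```

For simple arithmetic such as this, the vectorized `sub` is usually the better choice; the elementwise route is mainly useful for functions that cannot be expressed as whole-column operations.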
