How to apply my own function on a groupby

                     EMBieding  AeolisBieding  ...   Diff_EM  Diff_Aeolis
StartTime                                      ...
2019-09-01 00:00:00    3058.24         3494.0  ...  -3126.24      -3562.0
2019-09-01 01:00:00    2906.01         3480.0  ...  -2974.01      -3548.0
2019-09-01 02:00:00    2836.22         3470.0  ...  -2903.22      -3537.0
2019-09-01 03:00:00    2805.66         3448.0  ...  -2848.66      -3491.0
2019-09-01 04:00:00    2541.54         3413.0  ...  -2606.54      -3478.0

             EMBieding  AeolisBieding  ...     Diff_EM  Diff_Aeolis
StartTime                              ...
0          1175.862033    1279.577236  ... -253.707561  -357.422764
1          1153.947724    1264.723577  ... -309.435528  -420.211382
2          1146.239016    1259.459016  ... -336.763607  -449.983607
3          1133.350976    1251.268293  ... -390.928211  -508.845528
4          1127.061789    1251.300813  ... -405.411382  -529.650407

# statistic calculates the different error measurements: NBIAS, NMAE, NRMSE.
# Input arguments are:
#   data: the output from the importdata function.
#   park_size: the installed power of the respective farm, for normalization.
#   filename: needed to produce a unique new filename.
def statistic(data,park_size,filename):

    def NBIAS(Diff_forecaster,park_size):
        return data[Diff_forecaster].mean()/park_size

    def NMAE(Bied_forecaster,park_size):
        return mean_absolute_error(data['Production'], data[Bied_forecaster]) /park_size

    def NRMSE(Bied_forecaster,park_size):
        return (sqrt(mean_squared_error(data['Production'], data[Bied_forecaster])) /np.square(park_size))

    # Calculate the overall error measure and save it directly in an external .csv
    ErrorMeasure = {'EM':[NBIAS('Diff_EM',park_size),NMAE('EMBieding',park_size),NRMSE('EMBieding',park_size)],
                    'Aeolis':[NBIAS('Diff_Aeolis',park_size),NMAE('Bied',park_size ),NRMSE('Bied',park_size)]}
    df_ErrorMeasure = pd.DataFrame(ErrorMeasure,index=['NBIAS','NMAE','NRMSE'])
    df_ErrorMeasure.to_csv('errormeasure'+filename)

    data_perhour=data.groupby(data.index.hour).apply(NBIAS('EMBieding',park_size))
    print(data_perhour)

2 Answers

User

Answer 1 · edited 2024-05-23 13:40:42

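NBIAS returns the mean (a float) divided by park_size. That is a single number, a numpy.float64, exactly as the error message says. apply expects a callable, such as a function or a lambda, which it then calls once per group.

Instead, try: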


data_perhour=data.groupby(data.index.hour).apply(lambda g: g['EMBieding'].mean()/park_size)
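To make the difference between a value and a callable concrete, here is a minimal sketch. The toy DataFrame, the `park_size` value, and the names `already_computed` and `per_group` are assumptions made up for illustration, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two days of hourly values (not the asker's real data).
index = pd.date_range("2019-09-01", periods=48, freq=pd.Timedelta(hours=1))
data = pd.DataFrame({"Diff_EM": np.arange(48, dtype=float)}, index=index)
park_size = 10.0  # assumed installed power, used only for normalization

# Calling the function up front yields a plain number, which apply cannot call.
already_computed = data["Diff_EM"].mean() / park_size
print(type(already_computed), callable(already_computed))

# A lambda defers the computation until apply hands over each group.
per_group = lambda group: group["Diff_EM"].mean() / park_size
print(callable(per_group))

# One normalized mean per hour of day (0 to 23).
nbias_per_hour = data.groupby(data.index.hour).apply(per_group)
print(nbias_per_hour.head())
```

The lambda receives each hourly sub-DataFrame in turn, so the mean is taken over that group rather than over the whole frame.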

User

Answer 2 · edited 2024-05-23 13:40:42

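The pandas groupby apply method takes a callable, and that callable receives the subset of the DataFrame that belongs to each group. Your problem is that the NBIAS function has no parameter for that subset and always operates on the original DataFrame.

To use it in a groupby, you need to adjust it: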

def statistic(data,park_size,filename):
    def NBIAS(Diff_forecaster,park_size, df=data):
        return df[Diff_forecaster].mean()/park_size

Then you can use it like this:

data_perhour=data.groupby(data.index.hour).apply(lambda subdf: NBIAS('EMBieding',park_size, subdf))

print(data_perhour)
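For anyone who wants to run the idea end to end, the following is a self-contained sketch under stated assumptions: the random toy data, the `park_size` of 3600.0, and the standalone `nbias` helper (a module-level stand-in for the nested NBIAS) are all hypothetical. It also compares the apply-based result with the built-in grouped mean, which should give the same numbers for this particular metric.

```python
import numpy as np
import pandas as pd


def nbias(diff_column, park_size, df):
    """Normalized bias: mean of a difference column divided by installed power."""
    return df[diff_column].mean() / park_size


# Hypothetical toy data: three days of hourly forecast differences.
rng = np.random.default_rng(0)
index = pd.date_range("2019-09-01", periods=72, freq=pd.Timedelta(hours=1))
data = pd.DataFrame(
    {
        "Diff_EM": rng.normal(-300.0, 50.0, size=72),
        "Diff_Aeolis": rng.normal(-400.0, 50.0, size=72),
    },
    index=index,
)
park_size = 3600.0  # assumed installed power

hours = data.index.hour

# apply calls the lambda once per hour of day, passing that hour's rows as subdf.
via_apply = data.groupby(hours).apply(
    lambda subdf: nbias("Diff_EM", park_size, subdf)
)

# For a plain mean, the built-in aggregation covers several columns at once.
via_mean = data.groupby(hours)[["Diff_EM", "Diff_Aeolis"]].mean() / park_size

print(via_apply.head())
print(via_mean.head())
```

The apply route stays useful for metrics that need more than one column per group, while the built-in mean is usually simpler when a per-column average is all that is required.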
