An elegant way to call multiple functions in Python

Posted 2024-04-27 11:02:26

I'm doing data cleaning in Python. I have the following workflow for calling all of my functions:

if __name__ == "__main__":
    data_file, hash_file, cols = read_file()
    survey_data, cleaned_hash_file = format_files(data_file, hash_file, cols)
    survey_data, cleaned_hash_file = rename_columns(survey_data, cleaned_hash_file)
    survey_data, cleaned_hash_file = data_transformation_stage_1(survey_data, cleaned_hash_file)
    observation, survey_data, cleaned_hash_file = data_transformation_stage_2(survey_data, cleaned_hash_file)
    observation, survey_data, cleaned_hash_file = data_transformation_stage_3(observation, survey_data, cleaned_hash_file)
    observation, survey_data, cleaned_hash_file = observation_date_fill(observation, survey_data, cleaned_hash_file)
    write_file(observation, survey_data, cleaned_hash_file)

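So the output of each function (the variables in its return statement) is used as the input to the next function. All of the functions return dataframes, so observation, survey_data, cleaned_hash_file, data_file, hash_file and cols are all dataframes passed between the functions.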

Is there a better, more elegant way to write this?


3 answers

Create this class:

class ProcessingChain:

    def __init__(self, *callables):
        self.operations = callables

    def process(self, *args):
        for operation in self.operations:
            args = operation(*args)
        return args

Use it like this:

processing = ProcessingChain(format_files, rename_columns, data_transformation_stage_1, data_transformation_stage_2, data_transformation_stage_3, observation_date_fill)
data_file, hash_file, cols = read_file()
observation, survey_data, cleaned_hash_file = processing.process(data_file, hash_file, cols)
write_file(observation, survey_data, cleaned_hash_file)
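To try the chain without the real dataframes, here is a minimal sketch with stand-in stages. The functions `add_one`, `swap` and `with_total` are hypothetical and only exist for this illustration, and plain integers stand in for dataframes. It assumes every stage returns a tuple whose length matches the parameter count of the next stage.

```python
class ProcessingChain:
    """Feeds the return tuple of each callable into the next one."""

    def __init__(self, *callables):
        self.operations = callables

    def process(self, *args):
        for operation in self.operations:
            # Each stage must return a tuple so it can be unpacked again.
            args = operation(*args)
        return args


# Hypothetical stand-in stages; the real ones would take and return dataframes.
def add_one(a, b):
    return a + 1, b + 1


def swap(a, b):
    return b, a


def with_total(a, b):
    # A stage may return more values than it received,
    # as data_transformation_stage_2 does in the question.
    return a + b, a, b


chain = ProcessingChain(add_one, swap, with_total)
result = chain.process(1, 10)
print(result)  # (13, 11, 2)
```

One thing to watch: a stage that returns a single dataframe rather than a tuple would break the `*args` unpacking in the next step, so such a stage would need to return a one-element tuple.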

You can extend Python's map so that it accepts several functions to map, like this:

def map_many(iterable, function, *other):
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)


inputs = read_file()
dfs_1 = map_many(inputs, format_files, rename_columns, data_transformation_stage_1, data_transformation_stage_2)
dfs_2 = map_many(dfs_1, data_transformation_stage_3, observation_date_fill)
write_file(*dfs_2)
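A small sketch of how `map_many` behaves may help when deciding whether it fits. The input list `raw` is hypothetical, and plain strings stand in for dataframes. Each function is called once per element with a single argument, so this approach seems to suit stages that transform one dataframe at a time, not stages that need several dataframes together as in the question.

```python
def map_many(iterable, function, *other):
    # Apply the first function lazily, then recurse on the remaining ones.
    if other:
        return map_many(map(function, iterable), *other)
    return map(function, iterable)


# Hypothetical stand-in data: strings instead of dataframes.
raw = ["  alice ", " bob", "carol  "]

# str.strip runs on every element first, then str.title on each result.
cleaned = list(map_many(raw, str.strip, str.title))
print(cleaned)  # ['Alice', 'Bob', 'Carol']
```

Because `map` is lazy, nothing runs until the result is consumed, here by `list(...)`.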

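Try iterating over the functions. This assumes that the inputs of the current iteration are in the same order as the outputs of the previous one: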

funcs = [read_file, format_files, rename_columns, data_transformation_stage_1, data_transformation_stage_2, data_transformation_stage_3, observation_date_fill, write_file]

output = []
for func in funcs:
    output = func(*output)
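The same loop can also be written as a fold. This is a minimal sketch using `functools.reduce`, not part of the original answer. The helper `run_pipeline` and the stages `load`, `double` and `combine` are hypothetical names invented for the illustration, with integers standing in for dataframes.

```python
from functools import reduce


def run_pipeline(funcs, initial=()):
    """Call each function with the unpacked output of the previous one."""
    return reduce(lambda args, func: func(*args), funcs, initial)


# Hypothetical stand-in stages that return tuples, like the dataframe stages.
def load():
    return 1, 2


def double(a, b):
    return a * 2, b * 2


def combine(a, b):
    # Return a one-element tuple so the result can still be unpacked.
    return (a + b,)


final = run_pipeline([load, double, combine])
print(final)  # (6,)
```

As with the loop, the first function is called with no arguments because the initial value is an empty tuple.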
