我从数据集1.xlsx中收集的数据帧如下所示:
TimePoint Object 0 Object 1 Object 2 Object 3 Object 4 Object 0 Object 1 Object 2 Object 3 Object 4
0 10 4642.99 2000.71 4869.52 4023.69 3008.99 11188.15 2181.62 12493.47 10275.15 8787.99
1 20 4640.09 2005.17 4851.07 4039.73 3007.16 11129.38 2172.37 12438.31 10218.92 8723.45
问题: 数据包含具有重复名称的标题列。需要将其聚合以查找引用,然后初始化每个对象的IDA和IAA值
根据这些新值,需要计算Fc和EAPP值。因此,最终excel输出应如下所示:
TimePoint Objects IDA IAA Fc (using IDA- (a * IAA)) EAPP (using Fc/ (Fc + (G *Fc)))
10 Object 0 4642.99 11188.15 3300.412 0.463177397
10 Object 1 2000.71 2181.62 -527.78758 1
10 Object 2 4869.52 12493.47 4869.52 1
10 Object 3 4023.69 10275.15 4023.69 1
10 Object 4 3008.99 8787.99 3008.99 1
20 Object 0 4640.09 11129.38 4640.09 1
20 Object 1 2005.17 2172.37 2005.17 1
20 Object 2 4851.07 12438.31 4851.07 1
20 Object 3 4039.73 10218.92 4039.73 1
20 Object 4 3007.16 8723.45 3007.16 1
我尝试使用以下python脚本解决此问题:
def main():
all_data = pd.DataFrame()
a = 0.12
G = 1.159
for f in glob.glob("data/dataset1.xlsx"):
df = pd.read_excel(f, 'Sheet1') # , header=[1]
all_data = all_data.append(df, ignore_index=True, sort=False)
all_data.columns = all_data.columns.str.split('.').str[0]
print(all_data)
object_df = all_data.groupby(all_data.columns, axis=1)
print(object_df)
for k in object_df.groups.keys():
if k != 'TimePoint':
for row_index, row in object_df.get_group(k).iterrows():
print(row)
# This logic is not working to group by Object and then apply the Following formula
# TODO: Calculation for the new added columns Assumption every time there will be two occurrence of any
# Object i.e. Object 0...4 in this example but Object count can varies sometime only one Object can
# appear
# IDA is the first occurrence value of the Object
all_data['IDA'] = row[0] # This is NOT correct
# IAA is the second occurrence value of the Object
all_data['IAA'] = row[1]
all_data['Fc'] = all_data.IDA.fillna(0) - (a * all_data.IAA.fillna(0))
all_data['EAPP'] = all_data.Fc.fillna(0) / (all_data.Fc.fillna(0) + (G * all_data.Fc.fillna(0)))
# now save the data frame
writer = pd.ExcelWriter('data/dataset1.xlsx')
all_data.to_excel(writer, 'Sheet2', index=True)
writer.save()
if __name__ == '__main__':
main()
请让我知道如何在上面的代码中使用groupby为每个对象分配IDA和IAA值
我想melt可能对你有很大帮助
相关问题 更多 >
编程相关推荐