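Python pandas: for loop to create and merge pivot tables

How do I write a for loop that creates and merges a pivot table for every column?

Consider this df (the real dataset has around 50 estimate columns; I have simplified it to 4 here):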

import numpy as np
import pandas as pd

raw_data = {
    'Year': [2009, 2009, 2010, 2010, 2010, 2010],
    'Quarter': [4, 4, 1, 1, 2, 2],
    'Sector': ['GPU', 'GPU', 'Gaming', 'Gaming', 'Gaming', 'Gaming'],
    'Ticker': ['NVID', 'NVID', 'ATVI', 'ATVI', 'ATVI', 'ATVI'],
    'Metric': ['EPS', 'REV', 'EPS', 'REV', 'EPS', 'REV'],
    'Estimate 1': [1.4, 350, 0.2, 500, 0.9, 120],
    'Estimate 2': [1.2, 375, 0.22, 505, 1.0, 120],
    'Estimate 3': [2.1, 250, 0.2, 510, 0.8, 120],
    'Estimate 4': [1.4, 360, 0, 400, 1.9, 125],
}
df = pd.DataFrame(raw_data, columns=['Year', 'Quarter', 'Sector', 'Ticker', 'Metric',
                                     'Estimate 1', 'Estimate 2', 'Estimate 3', 'Estimate 4'])
print(df)

Desired output: I am looking for a df like this:

I can do this for a single column with pd.pivot() and pd.merge, but I am not sure how to structure it as a for loop.

feature_names = ('Year', 'Quarter', 'Sector', 'Ticker')
not_feature_names = ['Metric', 'Estimate 1', 'Estimate 2', 'Estimate 3', 'Estimate 4']
df_pivot = df.drop(not_feature_names, axis=1)
df_pivot1 = df.pivot_table(index=feature_names, columns='Metric', values='Estimate 1',)
df_pivot1 = df_pivot1.reset_index().rename_axis(None, axis=1)
df_pivot1.rename(columns={'EPS': 'EPS_1', 'REV': 'REV_1'}, inplace=True)
df_Full = df_pivot1.merge(df_pivot, on=(feature_names), suffixes=('_l', '_r'))
print(df_Full)

My attempt at the loop:

for (name, i) in zip(not_feature_names, range(1, 4)):
    df_pivot1 = df.pivot_table(index=feature_names, columns='Metric', values=name,)
    df_pivot1 = df_pivot1.reset_index().rename_axis(None, axis=1)
    df_pivot1.rename(columns={'EPS': ('EPS_'+i), 'REV': ('REV_'+i)}, inplace=True)
    df_Full = df_pivot1.merge(df_pivot, on=(feature_names), suffixes=('_l', '_r'))

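2 Answers

User

Answer 1 · edited 2024-05-14 08:38:04

A more verbose approach is to melt the estimate columns, do some string replacement and concatenation on the labels, and finally pivot the result back to wide form.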

df1 = df.melt(id_vars=['Year', 'Quarter', 'Ticker','Metric'],
        value_vars=['Estimate 1', 'Estimate 2', 'Estimate 3', 'Estimate 4'])

df1['variable'] = df1.variable.str.replace('Estimate ', '')
df1['Metric'] = df1['Metric'] + '_' + df1['variable']
df1.pivot_table(index=['Year', 'Quarter', 'Ticker'], columns='Metric', values='value').reset_index()
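
With around 50 estimate columns, typing out value_vars by hand gets tedious. The following is a minimal sketch of the same melt-then-pivot idea that collects the estimate columns automatically. It assumes every estimate column is named 'Estimate <n>', and it keeps 'Sector' among the id columns, which the snippet above leaves out. The names id_cols, estimate_cols, long_df and wide are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Year': [2009, 2009, 2010, 2010, 2010, 2010],
    'Quarter': [4, 4, 1, 1, 2, 2],
    'Sector': ['GPU', 'GPU', 'Gaming', 'Gaming', 'Gaming', 'Gaming'],
    'Ticker': ['NVID', 'NVID', 'ATVI', 'ATVI', 'ATVI', 'ATVI'],
    'Metric': ['EPS', 'REV', 'EPS', 'REV', 'EPS', 'REV'],
    'Estimate 1': [1.4, 350, 0.2, 500, 0.9, 120],
    'Estimate 2': [1.2, 375, 0.22, 505, 1.0, 120],
    'Estimate 3': [2.1, 250, 0.2, 510, 0.8, 120],
    'Estimate 4': [1.4, 360, 0, 400, 1.9, 125],
})

id_cols = ['Year', 'Quarter', 'Sector', 'Ticker']
# Assumption: every estimate column is named 'Estimate <n>'.
estimate_cols = [c for c in df.columns if c.startswith('Estimate ')]

# Long form: one row per (id columns, Metric, estimate column).
long_df = df.melt(id_vars=id_cols + ['Metric'], value_vars=estimate_cols)

# Build labels such as 'EPS_1' from the metric and the estimate number.
estimate_no = long_df['variable'].str.replace('Estimate ', '', regex=False)
long_df['Metric'] = long_df['Metric'] + '_' + estimate_no

# Back to wide form: one column per metric/estimate pair.
wide = (long_df.pivot_table(index=id_cols, columns='Metric', values='value')
        .reset_index()
        .rename_axis(None, axis=1))
print(wide)
```

Note that pivot_table aggregates with the mean by default, so this only reproduces the raw values when each combination of id columns and metric appears once, as it does in the sample data.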


User

Answer 2 · edited 2024-05-14 08:38:04

I don't think you need a for loop here; you can use pandas reshaping:

df_out = df.set_index(['Year','Quarter','Sector','Ticker','Metric']).unstack()
df_out.columns = df_out.columns.get_level_values(1)+'_'+df_out.columns.get_level_values(0).str.split(' ').str[1]
df_out.reset_index()
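
If a loop is still preferred, the per-column pivot from the question can be sketched roughly as below. The attempt in the question zips over a list that starts with 'Metric', stops at range(1, 4), and concatenates a string with an integer; this sketch instead iterates over the estimate columns directly and takes the suffix from the column name. It assumes the columns are named 'Estimate <n>', and it swaps the chained merge for pd.concat, since every pivot shares the same index. The names estimate_cols, pieces and df_full are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Year': [2009, 2009, 2010, 2010, 2010, 2010],
    'Quarter': [4, 4, 1, 1, 2, 2],
    'Sector': ['GPU', 'GPU', 'Gaming', 'Gaming', 'Gaming', 'Gaming'],
    'Ticker': ['NVID', 'NVID', 'ATVI', 'ATVI', 'ATVI', 'ATVI'],
    'Metric': ['EPS', 'REV', 'EPS', 'REV', 'EPS', 'REV'],
    'Estimate 1': [1.4, 350, 0.2, 500, 0.9, 120],
    'Estimate 2': [1.2, 375, 0.22, 505, 1.0, 120],
    'Estimate 3': [2.1, 250, 0.2, 510, 0.8, 120],
    'Estimate 4': [1.4, 360, 0, 400, 1.9, 125],
})

feature_names = ['Year', 'Quarter', 'Sector', 'Ticker']
# Assumption: every estimate column is named 'Estimate <n>'.
estimate_cols = [c for c in df.columns if c.startswith('Estimate ')]

pieces = []
for name in estimate_cols:
    suffix = name.split(' ')[1]  # 'Estimate 1' -> '1'
    # One pivot per estimate column: rows are the features, columns are the metrics.
    piece = df.pivot_table(index=feature_names, columns='Metric', values=name)
    # Rename 'EPS'/'REV' to 'EPS_1'/'REV_1' and so on.
    piece.columns = [f'{metric}_{suffix}' for metric in piece.columns]
    pieces.append(piece)

# All pieces share the same index, so place them side by side.
df_full = pd.concat(pieces, axis=1).reset_index()
print(df_full)
```

Collecting the pieces in a list and concatenating once avoids overwriting df_Full on every iteration, which is what the loop in the question would do.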

