Pivot a pandas DataFrame into prefixed columns instead of a MultiIndex

import pandas as pd

ts = pd.DataFrame([
    ['Jan 2000', 'WidgetCo', 0.5, 2],
    ['Jan 2000', 'GadgetCo', 0.3, 3],
    ['Jan 2000', 'SnazzyCo', 0.2, 4],
    ['Feb 2000', 'WidgetCo', 0.4, 2],
    ['Feb 2000', 'GadgetCo', 0.5, 2.5],
    ['Feb 2000', 'SnazzyCo', 0.1, 4],
], columns=['month', 'company', 'share', 'price'])

Pivoting with pd.pivot_table(ts, index='month', columns='company') gives MultiIndex columns:

            share                       price
company  GadgetCo SnazzyCo WidgetCo GadgetCo SnazzyCo WidgetCo
month
Feb 2000      0.5      0.1      0.4      2.5        4        2
Jan 2000      0.3      0.2      0.5      3.0        4        2

Desired result, with flat prefixed columns:

          WidgetCo_share  WidgetCo_price  GadgetCo_share  GadgetCo_price  ...
month
Jan 2000             0.5               2             0.3             3.0
Feb 2000             0.4               2             0.5             2.5

def pivot_table_to_flat(df, column, index):
    res = df.set_index(index)
    cols = res.drop(column, axis=1).columns.values
    resulting_cols = []
    for prefix in res[column].unique():
        for col in cols:
            new_col_name = prefix + '_' + col
            res[new_col_name] = res[res[column] == prefix][col]
            resulting_cols.append(new_col_name)
    return res[resulting_cols]

pivot_table_to_flat(ts, index='month', column='company')

3 Answers

User

#1 · edited 2024-05-15 23:43:45

I figured it out. Working from the data on the MultiIndex itself gives a very clean solution:

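def flatten_multi_index(df):
    # Expects two-level MultiIndex columns: (value name, company).
    mi = df.columns
    suffixes, prefixes = mi.levels
    # MultiIndex.labels was renamed to MultiIndex.codes in pandas 0.24.
    col_names = [prefixes[i_p] + '_' + suffixes[i_s] for (i_s, i_p) in zip(*mi.codes)]
    df.columns = col_names  # note: modifies df in place
    return df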

flatten_multi_index(pd.pivot_table(ts,index='month', columns='company'))

The version above only handles a two-level MultiIndex, but it could be generalized if needed.
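One way that generalization might look is sketched below. The name flatten_columns and its sep and reverse parameters are illustrative assumptions, not part of the answer above. With reverse=True the innermost level comes first, so the company name ends up as the prefix.

```python
import pandas as pd


def flatten_columns(df, sep="_", reverse=True):
    """Return a copy of df whose MultiIndex columns are joined into flat strings.

    Hypothetical helper: handles any number of column levels and leaves
    frames with ordinary (non-MultiIndex) columns untouched.
    """
    out = df.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Iterating a MultiIndex yields one tuple of labels per column.
        out.columns = [
            sep.join(str(part) for part in (key[::-1] if reverse else key))
            for key in out.columns
        ]
    return out


# Usage on a cut-down version of the question's data.
ts = pd.DataFrame(
    [["Jan 2000", "WidgetCo", 0.5, 2], ["Jan 2000", "GadgetCo", 0.3, 3],
     ["Feb 2000", "WidgetCo", 0.4, 2], ["Feb 2000", "GadgetCo", 0.5, 2.5]],
    columns=["month", "company", "share", "price"],
)
flat = flatten_columns(pd.pivot_table(ts, index="month", columns="company"))
print(sorted(flat.columns))
```

Unlike flatten_multi_index, this sketch returns a copy rather than modifying the frame passed in.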

User

#2 · edited 2024-05-15 23:43:45

This seems simpler:

df.columns = [' '.join(col).strip() for col in df.columns.values]

It takes a df with MultiIndex columns and flattens the column labels in place; the data in df is left unchanged.

(Reference: @andy haden, Python Pandas - How to flatten a hierarchical index in columns)
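To make the naming concrete, here is a small sketch of the one-liner on a hand-built frame with two-level columns. The frame and the variable names space_joined and prefixed are made up for illustration. Note that the one-liner joins levels outer-to-inner with a space, so getting the WidgetCo_share style from the question needs a small variation, shown second.

```python
import pandas as pd

# Small frame with two-level columns, mimicking the pivot_table output.
df = pd.DataFrame(
    [[0.5, 2.0], [0.4, 2.0]],
    index=["Jan 2000", "Feb 2000"],
    columns=pd.MultiIndex.from_tuples(
        [("share", "WidgetCo"), ("price", "WidgetCo")]
    ),
)

# The one-liner from the answer: each column label is a tuple of strings.
space_joined = [" ".join(col).strip() for col in df.columns.values]

# Variation (an assumption about the desired naming): reverse each tuple
# and join with "_" so the company name comes first.
prefixed = ["_".join(col[::-1]) for col in df.columns.values]

df.columns = prefixed
print(space_joined)
print(list(df.columns))
```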

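User

#3 · edited 2024-05-15 23:43:45

Update (as of early 2017 and pandas 0.19.2): you can use .values on a MultiIndex, so this snippet should flatten a MultiIndex for anyone who needs it. The snippet is too clever and yet not clever enough: it can handle either the row index or the column names of a DataFrame, but it blows up if the result of getattr(df, way) is not nested (i.e. not a MultiIndex).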

def flatten_multi(df, way='index'): # or way='columns'
    assert way in {'index', 'columns'}, "I'm sorry Dave."
    mi = getattr(df, way)
    flat_names = ["_".join(s) for s in mi.values]
    setattr(df, way, flat_names)
    return df
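A guarded variant along the following lines would avoid that failure on a non-nested axis. This is only a sketch: the name flatten_multi_safe and the sep parameter are assumptions, and it returns a copy instead of modifying df in place.

```python
import pandas as pd


def flatten_multi_safe(df, way="index", sep="_"):
    """Hypothetical guarded version of flatten_multi; always returns a copy."""
    if way not in ("index", "columns"):
        raise ValueError("way must be 'index' or 'columns'")
    out = df.copy()
    axis = getattr(out, way)
    # Only join labels when the axis really is a MultiIndex.
    if isinstance(axis, pd.MultiIndex):
        setattr(out, way, [sep.join(map(str, key)) for key in axis])
    return out


# A frame with ordinary columns passes through unchanged.
plain = pd.DataFrame({"a": [1, 2]})
plain_result = flatten_multi_safe(plain, way="columns")

# A frame with MultiIndex columns gets flat, joined labels.
nested = pd.DataFrame(
    [[1, 2]],
    columns=pd.MultiIndex.from_tuples(
        [("share", "WidgetCo"), ("price", "WidgetCo")]
    ),
)
nested_result = flatten_multi_safe(nested, way="columns")
print(list(nested_result.columns))
```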
