Merging pandas DataFrame columns that start with the same letter

User

Reply 1 · edited 2024-04-26 14:13:40

I suggest melt, followed by pivot. To resolve the duplicate entries, pivot on a column produced by cumcount:

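u = df.melt()
u['variable'] = u['variable'].str[0]  # extract the first letter
# number the rows within each letter so the pivot index has no duplicates
u = u.assign(count=u.groupby('variable').cumcount())
u.pivot(index='count', columns='variable', values='value')  # keyword arguments are required in pandas 2.0+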

variable    a    b    c
count                  
0         1.0  5.0  9.0
1         2.0  6.0  0.0
2         3.0  7.0  NaN
3         4.0  8.0  NaN

This can be rewritten as:

^{pr2}$
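
One way to write the melt-and-pivot steps as a single chained expression is sketched below. The sample df (columns a1, a2, b1, b2, c1) is an assumption reconstructed from the outputs shown in this thread, not data given by the poster.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

result = (
    df.melt()                                                    # long format: 'variable', 'value'
      .assign(variable=lambda d: d["variable"].str[0])           # keep only the first letter
      .assign(count=lambda d: d.groupby("variable").cumcount())  # running position within each letter
      .pivot(index="count", columns="variable", values="value")  # back to wide format
)
print(result)
```

Because melt stacks whole columns one after another, the values of a1 come before those of a2, which is why this approach yields 1, 2, 3, 4 in column a.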

If performance matters, pd.concat can be used instead:

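from operator import itemgetter

import pandas as pd

# Note: groupby(..., axis=1) is deprecated since pandas 2.1.
pd.concat({
    k: pd.Series(g.values.ravel())  # flatten each group row by row
    for k, g in df.groupby(itemgetter(0), axis=1)
}, axis=1)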

   a  b    c
0  1  5  9.0
1  3  7  0.0
2  2  6  NaN
3  4  8  NaN
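
On newer pandas versions, where grouping with axis=1 is deprecated, the same idea can be sketched by grouping the transposed frame instead. This is a variant, not the poster's code, and the sample df is an assumption reconstructed from the outputs shown in this thread.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

# Group the transposed frame by the first letter of each original column name.
# Each group g holds the original columns as rows, so g.T restores the
# original orientation before the values are flattened row by row.
result = pd.concat(
    {
        k: pd.Series(g.T.to_numpy().ravel())
        for k, g in df.T.groupby(lambda name: name[0])
    },
    axis=1,
)
print(result)
```

Flattening row by row interleaves a1 and a2, so column a comes out as 1, 3, 2, 4 rather than the 1, 2, 3, 4 produced by the melt approach.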

User

Reply 2 · edited 2024-04-26 14:13:40

We can try grouping the columns with groupby (axis=1):

def f(g, a):
    # stack the group's values row by row and discard the old index
    ret = g.stack().reset_index(drop=True)
    ret.name = a
    return ret

pd.concat((f(g, a) for a, g in df.groupby(df.columns.str[0], axis=1)), axis=1)

Output:

   a  b    c
0  1  5  9.0
1  3  7  0.0
2  2  6  NaN
3  4  8  NaN

User

Reply 3 · edited 2024-04-26 14:13:40

Use a dict comprehension:

# build each merged column from the row-wise flattened values of its group
df = pd.DataFrame({i: pd.Series(x.to_numpy().ravel())
                   for i, x in df.groupby(lambda x: x[0], axis=1)})
print(df)
   a  b    c
0  1  5  9.0
1  3  7  0.0
2  2  6  NaN
3  4  8  NaN
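
The dict comprehension above can also be sketched without groupby, by collecting the column names per first letter in plain Python. The sample df is an assumption reconstructed from the outputs in this thread, and the name groups is purely illustrative.

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs shown in this thread.
df = pd.DataFrame({
    "a1": [1, 2], "a2": [3, 4],
    "b1": [5, 6], "b2": [7, 8],
    "c1": [9, 0],
})

# Map each first letter to the list of columns that start with it.
groups = {}
for col in df.columns:
    groups.setdefault(col[0], []).append(col)

# Flatten each group's values row by row; shorter columns are padded with NaN.
merged = pd.DataFrame({
    letter: pd.Series(df[cols].to_numpy().ravel())
    for letter, cols in groups.items()
})
print(merged)
```

This avoids the axis=1 argument entirely, at the cost of an explicit loop over the column names.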
