Creating a DataFrame by mapping a list of arrays

2 answers

User

Answer 1 · edited 2024-04-26 00:33:36

Given:

import pandas as pd

input_vectors = pd.DataFrame({'vectors':[[['D', .5],['E',.3]],
                                         [['A',.3]],
                                         [['B',.8],['C',.5],['H',.2]]]})
input_vectors

Output:

                          vectors
0            [[D, 0.5], [E, 0.3]]
1                      [[A, 0.3]]
2  [[B, 0.8], [C, 0.5], [H, 0.2]]

and

df_input

Output:

   index col1 col2
0      0    A    B
1      1    B    H
2      2    C    D

Use:

pd.concat([pd.DataFrame(x, index=[i]*len(x)) 
            for i, x in input_vectors.itertuples()])\
  .join(df_input)

Output:

   0    1  index col1 col2
0  D  0.5      0    A    B
0  E  0.3      0    A    B
1  A  0.3      1    B    H
2  B  0.8      2    C    D
2  C  0.5      2    C    D
2  H  0.2      2    C    D
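
The same reshaping can also be sketched with Series.explode (available in pandas 0.25 and later), which may read more directly than building one small frame per row. This is only an illustrative alternative, not the answer's own method; the column labels name and value are hypothetical, chosen here in place of the default 0 and 1.

```python
import pandas as pd

df_input = pd.DataFrame({'index': [0, 1, 2],
                         'col1': ['A', 'B', 'C'],
                         'col2': ['B', 'H', 'D']})
input_vectors = pd.DataFrame({'vectors': [[['D', .5], ['E', .3]],
                                          [['A', .3]],
                                          [['B', .8], ['C', .5], ['H', .2]]]})

# One row per [name, value] pair; the original row label is repeated.
exploded = input_vectors['vectors'].explode()

# Split each pair into two columns ("name" and "value" are hypothetical labels).
pairs = pd.DataFrame(exploded.tolist(), index=exploded.index,
                     columns=['name', 'value'])

# Align with df_input on the shared row labels.
result = pairs.join(df_input)
print(result)
```

Because the pairs are never converted to strings, the value column stays numeric.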

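User

Answer 2 · edited 2024-04-26 00:33:36

Use the stack function to split the list of lists into rows. Then, for each row in the vectors column, convert it to a string and use split to create two columns, va1 and val2. Use concat to attach the index column to those two columns, then join the result with df_input on index. Finally, drop the index column, since it is not needed in the final output.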

import pandas as pd

# Build the two input frames
my_dict = {'index':[0,1,2], 'col1':['A','B','C'], 'col2':['B','H','D']}
df_input = pd.DataFrame(my_dict)
my_dict = {'index':[0,1,2],'vectors':[[['D', 0.5],['E', 0.3]],[['A', 0.3]],[['B', 0.8],['C', 0.5],['H', 0.2]]]}
df_output = pd.DataFrame(my_dict)

# One row per [name, value] pair, keeping the original row label as 'index'
df_output = df_output.vectors.apply(pd.Series).stack().rename('vectors')
df_output = df_output.to_frame().reset_index(1, drop=True).reset_index()
# Turn each pair into a 'name,value' string and split it into two string columns
df_tmp = df_output.vectors.apply(lambda x: ','.join(map(str, x))).str.split(',', expand=True)
df_tmp.columns = ['va1','val2']
# Attach the index column, then join with df_input on it
df_tmp = pd.concat([df_tmp, df_output['index']], axis=1, sort=False)
df_tmp = df_input.join(df_tmp.set_index('index'), on='index')
# Renumber the rows and drop the helper index column
df_tmp.reset_index(drop=True).drop(columns=['index'])

Result:

  col1 col2 va1 val2
0   A   B   D   0.5
1   A   B   E   0.3
2   B   H   A   0.3
3   C   D   B   0.8
4   C   D   C   0.5
5   C   D   H   0.2
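
One caveat with this approach: because each pair goes through a string join and str.split, the val2 column holds strings rather than numbers. If numeric values are needed, a conversion step along the following lines should work. This is a minimal sketch; the small df_tmp frame is a hypothetical stand-in for the split result, not the full output above.

```python
import pandas as pd

# Hypothetical stand-in for the va1/val2 columns produced by str.split
df_tmp = pd.DataFrame({'va1': ['D', 'E', 'A'],
                       'val2': ['0.5', '0.3', '0.3']})

# Values are strings at this point; convert them back to floats
df_tmp['val2'] = pd.to_numeric(df_tmp['val2'])
print(df_tmp.dtypes)
```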
