Pandas: adding the values of two or more different DataFrames via a list

Posted 2024-06-01 05:00:01


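I want to add up the values of three or more DataFrames by passing them as a list, rather than adding them one at a time.

First, I will use merge as an example.

The following lines merge the DataFrames (data0, data1, data2) one by one: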

final_data = data0.merge(data1, on=['player_id', 'player_name'])
final_data = final_data.merge(data2, on=['player_id', 'player_name'])

However, I can also merge the DataFrames through a list, which is very helpful when there are more of them, for example:

from functools import reduce

data_list = [data0, data1, data2]
final_data = reduce(lambda left, right: pd.merge(left, right, on=['player_id', 'player_name']), data_list)
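
To make the folding behaviour of reduce concrete, here is a minimal, self-contained sketch. The three tiny frames are made up for illustration (they are not the data from this question), and each one carries a different stat column so the merge produces no overlapping column names.

```python
from functools import reduce

import pandas as pd

keys = ['player_id', 'player_name']

# Made-up frames for illustration only; one stat column each.
frames = [
    pd.DataFrame({'player_id': [1, 2], 'player_name': ['A', 'B'], 'ab': [1, 2]}),
    pd.DataFrame({'player_id': [1, 2], 'player_name': ['A', 'B'], 'run': [3, 4]}),
    pd.DataFrame({'player_id': [1, 2], 'player_name': ['A', 'B'], 'hit': [5, 6]}),
]

# reduce applies the lambda pairwise: merge(merge(frames[0], frames[1]), frames[2])
merged = reduce(lambda left, right: pd.merge(left, right, on=keys), frames)
print(merged)
```

The same pattern works for any binary operation on DataFrames, which is what makes it a candidate for the addition asked about here.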

Now I have the following three DataFrames, and I want to add their values together.

data0

    player_id  player_name  ab  run  hit
0       28920     S. Smith   0    0    0
1       33351   T. Mancini   0    0    0
2       30267    C. Gentry   0    0    0
3       28513     A. Jones   0    0    0
4       31097   M. Machado   0    0    0
5       29170     C. Davis   0    0    0
6       29322    M. Trumbo   0    0    0
7       29564  W. Castillo   0    0    0
8       34885       H. Kim   0    0    0
9       32952   J. Rickard   0    0    0
10      31988    J. Schoop   0    0    0
11       5908   J.J. Hardy   0    0    0

Second,

data1

   player_id player_name  ab  run  hit
0      28920    S. Smith   1    4    6
1      33351  T. Mancini   0    0    2
2      28513    A. Jones   2    1    0
3      31097  M. Machado   1    8    0
4      34885      H. Kim   1    1    2
5      32952  J. Rickard   0    2    0
6      31988   J. Schoop   5    3    4
7       5908  J.J. Hardy   4    2   10

Next,

data2

   player_id player_name  ab  run  hit
0      28920    S. Smith   1    9    2
1      31097  M. Machado   3    3    3
2      29170    C. Davis   9    6    4
3      29322   M. Trumbo   3    5    7
4      32952  J. Rickard   1    3    4
5       5908  J.J. Hardy   0    0    5

The final DataFrame I want should look like this:

final_data

    player_id  player_name  ab  run  hit
0       28920     S. Smith   2   13    8
1       33351   T. Mancini   0    0    2
2       30267    C. Gentry   0    0    0
3       28513     A. Jones   2    1    0
4       31097   M. Machado   4   11    3
5       29170     C. Davis   9    6    4
6       29322    M. Trumbo   3    5    7
7       29564  W. Castillo   0    0    0
8       34885       H. Kim   1    1    2
9       32952   J. Rickard   1    5    4
10      31988    J. Schoop   5    3    4
11       5908   J.J. Hardy   4    2   15

I can get this result with the code below, but it adds the DataFrames one by one:

data0 = pd.read_csv('initial_df.csv')
data1 = pd.read_csv('add_vals1.csv')
data2 = pd.read_csv('add_vals2.csv')


data0 = data0.set_index(['player_id', 'player_name'])
data1 = data1.set_index(['player_id', 'player_name'])
data2 = data2.set_index(['player_id', 'player_name'])

final_data = data0.add(data1, fill_value=0).astype(int).reset_index()
final_data = final_data.set_index(['player_id', 'player_name'])
final_data = final_data.add(data2, fill_value=0).astype(int).reset_index()

Can anyone help me get the final result through a list, the way I did with merge above? Thank you very much.


1 answer
Anonymous user
#1 · posted 2024-06-01 05:00:01

I think you need the index_col parameter in read_csv to build a MultiIndex, and then DataFrame.add inside reduce:

from functools import reduce
import pandas as pd

# Read each file with (player_id, player_name) as a MultiIndex
data0 = pd.read_csv('initial_df.csv', index_col=['player_id', 'player_name'])
data1 = pd.read_csv('add_vals1.csv', index_col=['player_id', 'player_name'])
data2 = pd.read_csv('add_vals2.csv', index_col=['player_id', 'player_name'])
data_list = [data0, data1, data2]
# Rows missing from one frame are treated as 0 thanks to fill_value
final_data = reduce(lambda x, y: x.add(y, fill_value=0), data_list).reset_index()
print(final_data)
    player_id  player_name   ab   run   hit
0        5908   J.J. Hardy  4.0   2.0  15.0
1       28513     A. Jones  2.0   1.0   0.0
2       28920     S. Smith  2.0  13.0   8.0
3       29170     C. Davis  9.0   6.0   4.0
4       29322    M. Trumbo  3.0   5.0   7.0
5       29564  W. Castillo  0.0   0.0   0.0
6       30267    C. Gentry  0.0   0.0   0.0
7       31097   M. Machado  4.0  11.0   3.0
8       31988    J. Schoop  5.0   3.0   4.0
9       32952   J. Rickard  1.0   5.0   4.0
10      33351   T. Mancini  0.0   0.0   2.0
11      34885       H. Kim  1.0   1.0   2.0
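
The output above contains floats because add aligns the frames on the union of their indexes, and the rows that are missing from one side force an upcast. As a hedged sketch, the version below rebuilds a three-player subset of the question's data in memory (the make helper is hypothetical, introduced only to keep the example short) and appends astype(int), as the question's own code does, to restore integer columns.

```python
from functools import reduce

import pandas as pd

keys = ['player_id', 'player_name']


def make(rows):
    # Hypothetical helper: build a frame indexed by (player_id, player_name).
    return pd.DataFrame(rows, columns=keys + ['ab', 'run', 'hit']).set_index(keys)


# Subset of the question's data, typed in by hand instead of read from CSV.
data0 = make([(28920, 'S. Smith', 0, 0, 0),
              (33351, 'T. Mancini', 0, 0, 0),
              (29170, 'C. Davis', 0, 0, 0)])
data1 = make([(28920, 'S. Smith', 1, 4, 6),
              (33351, 'T. Mancini', 0, 0, 2)])
data2 = make([(28920, 'S. Smith', 1, 9, 2),
              (29170, 'C. Davis', 9, 6, 4)])

# Fold the list with add; fill_value=0 stands in for rows absent on one side.
total = reduce(lambda x, y: x.add(y, fill_value=0), [data0, data1, data2])

# Safe here because data0 lists every player, so no NaN survives the fold.
final_data = total.astype(int).reset_index()
print(final_data)
```

Note that the cast only works when no NaN remains; a player missing from both operands of an add would still produce NaN despite fill_value.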

Another solution is concat followed by sum over both index levels:

data_list = [data0, data1, data2]
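# sum(level=...) was removed in pandas 2.0; groupby(level=...).sum() is the equivalent.
# sort=False keeps the players in order of first appearance.
final_data = pd.concat(data_list).groupby(level=[0, 1], sort=False).sum().reset_index()
print(final_data)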
    player_id  player_name  ab  run  hit
0       28920     S. Smith   2   13    8
1       33351   T. Mancini   0    0    2
2       30267    C. Gentry   0    0    0
3       28513     A. Jones   2    1    0
4       31097   M. Machado   4   11    3
5       29170     C. Davis   9    6    4
6       29322    M. Trumbo   3    5    7
7       29564  W. Castillo   0    0    0
8       34885       H. Kim   1    1    2
9       32952   J. Rickard   1    5    4
10      31988    J. Schoop   5    3    4
11       5908   J.J. Hardy   4    2   15
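
For a self-contained version of the concat-then-sum approach, the sketch below stacks a two-player subset of the question's data and aggregates with groupby(level=...), a spelling that should work on both older and current pandas releases. The frames are typed in by hand rather than read from the CSV files.

```python
import pandas as pd

keys = ['player_id', 'player_name']
cols = keys + ['ab', 'run', 'hit']

# Subset of the question's data, indexed by (player_id, player_name).
data0 = pd.DataFrame([(28920, 'S. Smith', 0, 0, 0),
                      (29170, 'C. Davis', 0, 0, 0)], columns=cols).set_index(keys)
data1 = pd.DataFrame([(28920, 'S. Smith', 1, 4, 6)], columns=cols).set_index(keys)
data2 = pd.DataFrame([(28920, 'S. Smith', 1, 9, 2),
                      (29170, 'C. Davis', 9, 6, 4)], columns=cols).set_index(keys)

# Stack all rows, then sum the rows that share the same index pair.
stacked = pd.concat([data0, data1, data2])
final_data = stacked.groupby(level=keys, sort=False).sum().reset_index()
print(final_data)
```

Because no index alignment takes place, the columns stay integer and no astype(int) is needed.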
