How to sequentially compute DataFrame columns based on multiple conditions

Posted 2024-05-16 01:55:21

Suppose I have a dataset like this:

Object1    Object2    Outcome1  Outcome2      Total_TestsObject1 Total_TestsObject2   TotalOutcomes_Object1    TotalOutcomes_Object2
  A         C            0         1                    0                 0                      0                      0
  D         A            0         1                    0                 1                      0                      0
  B         E            1         0                    0                 0                      0                      0
  F         B            0         1                    0                 1                      0                      1 
  A         B            1         0                    2                 2                      1                      2  
  A         C            0         1                    3                 1                      1                      1
  D         A            0         1                    1                 4                      0                      1
  B         E            1         0                    3                 1                      1                      0
  F         B            0         1                    1                 4                      0                      2 
  A         B            1         0                    5                 5                      3                      3  

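What I want to do is create two columns:

Column 1 tracks the item in Object1 and sums over all of its previous unique pairs, as follows: TotalOutcomes_ObjectX / Total_TestsObjectX. Because the item in Object1 may previously have appeared in Object2, the sum must find the unique pairs across both object columns. Column 2 does the same for the item in Object2.

I know this is not very clear yet, and I am still trying to find a way to phrase it properly. But here is an example.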

In the last row we have A & B.

A's previous unique pairs closest to this event:

A   D   row 7
A   C   row 6

B's previous unique pairs closest to this event:

B    F  row 9
B    E  row 8

After finding these pairs, the sums are as follows:

Column 1(Object1):    Sum: (D_Outcomes + C_outcomes) / (D_TotalTest + C_TotalTest)
             = (1 + 1)/(1 + 1)=1


Column 2(Object2):    Sum: (F_Outcomes + E_outcomes) / (F_TotalTest + E_TotalTest)
             = (1 + 1)/(1 + 1)=1

So the process goes through each row, finds all of the previous unique pairs, and then computes the sum.

I don't know how to do this in Python.

 Input

    {
    "Object1": ["A", "A", "B", "C", "A", "F", "B"],
    "Object2": ["C", "D", "E", "D", "B", "E", "D"],
    "Outcome1": [1, 0, 1, 0, 0, 1, 0],
    "Outcome2": [0, 1, 0, 1, 1, 0, 1],
    "TotalOutcomes_Object1": [0, 0, 0, 1, 1, 0, 0],
    "TotalOutcomes_Object2": [0, 0, 0, 0, 0, 1, 0],
    "Total_TestsObject1": [0, 1, 0, 1, 2, 0, 2],
    "Total_TestsObject2": [0, 0, 0, 1, 1, 1, 2],
    }

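Conditions:

Because an object can appear as both Object1 and Object2, the sum must find the object's unique appearances across both Object1 and Object2.

Because the total outcomes and total tests are counted up as outcomes occur and tests are run, all of the unique pair partners must be found, and their latest total outcomes and total tests returned.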

Example:

Let's say A(Object1) and B(Object2) are a pair
A has previously been a pair with D, E and F
B has previously been a pair with C, D, G

Column1 then sums all of A's previous pair partners, meaning D, E, F
in the following way.

Sum: (D_Outcomes + E_Outcomes + F_Outcomes) / (D_TotalTest + E_TotalTest + F_TotalTest)

Column2 then sums all of B's previous pair partners, meaning C,D, G
in the following way.

Sum: (C_Outcomes + D_Outcomes + G_Outcomes) / (C_TotalTest + D_TotalTest + G_TotalTest)

The important point in these summations is that the pair partners' latest stats are used, because both TotalOutcomes and TotalTest are counted up as they occur.
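One possible reading of the rules above can be sketched in plain Python. This is only a sketch under several assumptions that the question does not confirm: each partner's stats are read from the most recent earlier row in which the pair met, the partner in the current row is left out of the sum (as in the A & B example), and the result is None when the summed tests are zero. The function names `partner_ratio` and `add_pair_columns`, the output keys `Column1` and `Column2`, and the four demo rows are all hypothetical.

```python
def partner_ratio(rows, i, obj, exclude):
    """Summed outcomes / summed tests over obj's unique partners in rows before i."""
    latest = {}  # partner -> (total outcomes, total tests) from the latest earlier pairing
    for row in rows[:i]:
        if row["Object1"] == obj:
            partner, side = row["Object2"], "2"
        elif row["Object2"] == obj:
            partner, side = row["Object1"], "1"
        else:
            continue
        if partner == exclude:
            continue  # assumption: the current partner is not counted
        # Later rows overwrite earlier ones, so the latest pairing wins.
        latest[partner] = (
            row["TotalOutcomes_Object" + side],
            row["Total_TestsObject" + side],
        )
    outcomes = sum(o for o, _ in latest.values())
    tests = sum(t for _, t in latest.values())
    return outcomes / tests if tests else None  # assumption: None when undefined


def add_pair_columns(rows):
    """Return copies of the rows with the two ratio columns added."""
    result = []
    for i, row in enumerate(rows):
        a, b = row["Object1"], row["Object2"]
        result.append({
            **row,
            "Column1": partner_ratio(rows, i, a, exclude=b),
            "Column2": partner_ratio(rows, i, b, exclude=a),
        })
    return result


# Hypothetical demo rows (only the fields the sketch reads).
rows = [
    {"Object1": "A", "Object2": "C", "TotalOutcomes_Object1": 0,
     "TotalOutcomes_Object2": 0, "Total_TestsObject1": 0, "Total_TestsObject2": 0},
    {"Object1": "D", "Object2": "A", "TotalOutcomes_Object1": 1,
     "TotalOutcomes_Object2": 0, "Total_TestsObject1": 3, "Total_TestsObject2": 1},
    {"Object1": "A", "Object2": "C", "TotalOutcomes_Object1": 0,
     "TotalOutcomes_Object2": 1, "Total_TestsObject1": 2, "Total_TestsObject2": 1},
    {"Object1": "A", "Object2": "B", "TotalOutcomes_Object1": 1,
     "TotalOutcomes_Object2": 0, "Total_TestsObject1": 3, "Total_TestsObject2": 0},
]

result = add_pair_columns(rows)
# Last row: A's partners are C (1 outcome, 1 test) and D (1 outcome, 3 tests).
print(result[-1]["Column1"], result[-1]["Column2"])  # 0.5 None
```

With pandas, the rows could be obtained with `df.to_dict("records")` and the result turned back into a frame with `pd.DataFrame(result)`. The loop rescans all earlier rows for every row, which is simple but quadratic; if "latest stats" should instead mean the partner's most recent appearance against anyone, only the bookkeeping inside `partner_ratio` would need to change.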
