How to count occurrences in a DataFrame based on multiple conditions

Posted 2024-03-29 13:07:51

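I am trying to figure out how to count occurrences in a DataFrame using multiple conditions. In this particular example, I want to know the number of female passengers in Pclass 3.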

       PassengerId  Pclass     Sex   Age  SibSp  Parch   Ticket     Fare  Cabin  Embarked
    0          892       3    male  34.5      0      0   330911   7.8292    NaN         Q
    1          893       3  female  47.0      1      0   363272   7.0000    NaN         S
    2          894       2    male  62.0      0      0   240276   9.6875    NaN         Q
    3          895       3    male  27.0      0      0   315154   8.6625    NaN         S
    4          896       3  female  22.0      1      1  3101298  12.2875    NaN         S

Here are a few of my failed attempts:

    len(test[test["Sex"] == "female", test["Pclass"] == 3])
    sum(test.Pclass == 3 & test.Sex == "female")
    test.[test["Sex"] == "female", test["Pclass"] == 3].count()

None of them seem to work. In the end I wrote my own function, but there must be a simpler way to compute this:

def countif(sex, pclass):
    x = 0
    for i in range(0,len(test)):
        s = test.iloc[i]['Sex']
        c = test.iloc[i]['Pclass']
        if s == sex and c == pclass:
            x = x + 1
    return x

Thanks in advance.


1 answer
User
#1 · Posted 2024-03-29 13:07:51

There are several ways to do this. First, the sample data:

import numpy as np
import pandas as pd

# Rebuild the five sample rows from the question
test = pd.DataFrame({'PassengerId': {0: 892, 1: 893, 2: 894, 3: 895, 4: 896},
      'Pclass': {0: 3, 1: 3, 2: 2, 3: 3, 4: 3},
      'Sex': {0: 'male', 1: 'female', 2: 'male', 3: 'male', 4: 'female'},
      'Age': {0: 34.5, 1: 47.0, 2: 62.0, 3: 27.0, 4: 22.0},
      'SibSp': {0: 0, 1: 1, 2: 0, 3: 0, 4: 1},
      'Parch': {0: 0, 1: 0, 2: 0, 3: 0, 4: 1},
      'Ticket': {0: 330911, 1: 363272, 2: 240276, 3: 315154, 4: 3101298},
      'Fare': {0: 7.8292, 1: 7.0, 2: 9.6875, 3: 8.6625, 4: 12.2875},
      'Cabin': {0: np.nan, 1: np.nan, 2: np.nan, 3: np.nan, 4: np.nan},
      'Embarked': {0: 'Q', 1: 'S', 2: 'Q', 3: 'S', 4: 'S'}})

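You need to put each boolean condition in parentheses and combine them with &. Without the parentheses, & binds more tightly than ==, so your second attempt is parsed as test.Pclass == (3 & test.Sex) == "female".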

sum((test.Pclass == 3) & (test.Sex == "female"))
len(test[(test.Pclass == 3) & (test.Sex == "female")])
test[(test["Sex"] == "female") & (test["Pclass"] == 3)].shape[0]
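As a quick check, here is a minimal self-contained sketch of the same idea on a trimmed copy of the sample data (only the two columns involved). The names mask and count are just illustrative, not part of the original code:

```python
import pandas as pd

# Trimmed copy of the sample rows: only the columns the filter needs
test = pd.DataFrame({
    "Pclass": [3, 3, 2, 3, 3],
    "Sex": ["male", "female", "male", "male", "female"],
})

# Each parenthesized comparison yields a boolean Series;
# & then combines the two Series element-wise.
mask = (test["Pclass"] == 3) & (test["Sex"] == "female")

# True counts as 1, so summing the mask counts the matching rows.
count = int(mask.sum())
print(count)
```

On these five rows this prints 2. Keeping the mask in a variable also makes it easy to reuse, for example test[mask] to look at the matching rows themselves.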

Or you can build a cross-tabulation:

tab = pd.crosstab(test.Pclass, test.Sex)

Sex     female  male
Pclass
2            0     1
3            2     2

tab.iloc[tab.index==3]['female']
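If you want a plain number rather than a one-element Series, a label-based lookup with .loc is one option. This is a small sketch on a trimmed copy of the sample data; the variable name n_female_third is made up for illustration:

```python
import pandas as pd

# Trimmed copy of the sample rows: only the columns being tabulated
test = pd.DataFrame({
    "Pclass": [3, 3, 2, 3, 3],
    "Sex": ["male", "female", "male", "male", "female"],
})

# Rows are Pclass values, columns are Sex values, cells are counts
tab = pd.crosstab(test["Pclass"], test["Sex"])

# .loc[row_label, column_label] picks out a single cell
n_female_third = int(tab.loc[3, "female"])
print(n_female_third)
```

The crosstab computes every Pclass/Sex combination at once, so it is handy when you need more than one of these counts.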
