Counting how often a number appears in one column when another column equals a given text

Year  Course   Module  Q1  Q2
2015  Physics  CS1203   4   2
2015  Physics  CS1203   4   3
2015  Physics  CS1203   3   1
2015  Physics  CS1203   4   4
2015  English  IR0001   2   5
2015  English  IR0001   1   2
2015  English  IR0001   3   1
2015  English  IR0001   5   3
2015  English  IR0001   4   3

3 Answers

User

#1 · edited 2024-04-25 05:52:22

Maybe you can try something like this:

# dict-based renaming in agg() was removed in pandas 1.0; named aggregation does the same job
df.groupby(['module','q1'])['module'].agg(Frequency='count')

Please refer to this post.
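A minimal sketch of the same idea on data rebuilt from the question's table. The lowercase column names follow the answers rather than the question's header, and `size()` is swapped in for `count` here as an assumption, since it counts rows per group without needing a column to be selected:

```python
import pandas as pd

# Sample data reconstructed from the question (lowercase column names are an assumption)
df = pd.DataFrame({
    "year": [2015] * 9,
    "course": ["Physics"] * 4 + ["English"] * 5,
    "module": ["CS1203"] * 4 + ["IR0001"] * 5,
    "q1": [4, 4, 3, 4, 2, 1, 3, 5, 4],
    "q2": [2, 3, 1, 4, 5, 2, 1, 3, 3],
})

# One row per (module, q1) pair, with the number of rows that share that pair
freq = df.groupby(["module", "q1"]).size().reset_index(name="Frequency")
print(freq)

# Look up a single combination: how many times q1 == 4 in module CS1203
selected = freq[(freq["module"] == "CS1203") & (freq["q1"] == 4)]
count_cs1203_4 = int(selected["Frequency"].iloc[0])
print(count_cs1203_4)
```

This gives the full frequency table in one pass, which is handy when the counts for several modules or values are needed rather than a single one.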

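User

#2 · edited 2024-04-25 05:52:22

You can first filter the DataFrame by module (df.module == 'CS1203'), then keep only the columns whose names match the regex q\d+, compare the values against 4, and finally sum the matches per column: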

In [74]: (df[df.module == 'CS1203'].filter(regex=r'q\d+') == 4).sum()
Out[74]:
q1    3
q2    1
dtype: int64
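If the count is needed for every module at once, the same comparison can be grouped by module instead of filtered. This is only a sketch under the assumption that the columns are named as in the answers; the variable names `hits` and `per_module` are illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question (lowercase column names are an assumption)
df = pd.DataFrame({
    "year": [2015] * 9,
    "course": ["Physics"] * 4 + ["English"] * 5,
    "module": ["CS1203"] * 4 + ["IR0001"] * 5,
    "q1": [4, 4, 3, 4, 2, 1, 3, 5, 4],
    "q2": [2, 3, 1, 4, 5, 2, 1, 3, 3],
})

# Boolean frame: True wherever a q-column holds the value 4
hits = df.filter(regex=r"^q\d+$").eq(4)

# Summing booleans counts the True values; grouping by module gives one row per module
per_module = hits.groupby(df["module"]).sum()
print(per_module)
```

Each cell of `per_module` is the number of 4s in that question column for that module.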

User

#3 · edited 2024-04-25 05:52:22

I think you need boolean indexing:

print (df[(df.module == 'CS1203') & (df.q1 == 4)])
   year   course  module  q1  q2
0  2015  Physics  CS1203   4   2
1  2015  Physics  CS1203   4   3
3  2015  Physics  CS1203   4   4

print (len(df[(df.module == 'CS1203') & (df.q1 == 4)]))
3

If you need to count across all q columns, first reshape the frame with melt:

print (df)
   year   course  module  q1  q2
0  2015  Physics  CS1203   4   2
1  2015  Physics  CS1203   4   3
2  2015  Physics  CS1203   3   1
3  2015  Physics  CS1203   4   4
4  2015  English  IR0001   2   5
5  2015  English  IR0001   1   2
6  2015  English  IR0001   3   1
7  2015  English  IR0001   5   3
8  2015  English  IR0001   4   3

# reshape from wide to long: q1 and q2 are stacked into a single column q
df = pd.melt(df, id_vars=['year','course','module'], value_name='q')

print (df[(df.module == 'CS1203') & (df.q == 4)])
    year   course  module variable  q
0   2015  Physics  CS1203       q1  4
1   2015  Physics  CS1203       q1  4
3   2015  Physics  CS1203       q1  4
12  2015  Physics  CS1203       q2  4

print (len(df[(df.module == 'CS1203') & (df.q == 4)]))
4
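The two counts above can also be obtained by summing the boolean mask directly, without building the filtered frame first. A hedged sketch on data rebuilt from the question; `long_df` and the `question` column name are illustrative choices, not part of the original answer:

```python
import pandas as pd

# Sample data reconstructed from the question (lowercase column names are an assumption)
df = pd.DataFrame({
    "year": [2015] * 9,
    "course": ["Physics"] * 4 + ["English"] * 5,
    "module": ["CS1203"] * 4 + ["IR0001"] * 5,
    "q1": [4, 4, 3, 4, 2, 1, 3, 5, 4],
    "q2": [2, 3, 1, 4, 5, 2, 1, 3, 3],
})

# Count in a single column: True counts as 1 when a boolean mask is summed
mask = (df["module"] == "CS1203") & (df["q1"] == 4)
count_q1 = int(mask.sum())

# Count across all q columns: melt to long format, then apply the same kind of mask
long_df = df.melt(id_vars=["year", "course", "module"],
                  var_name="question", value_name="q")
long_mask = (long_df["module"] == "CS1203") & (long_df["q"] == 4)
count_all = int(long_mask.sum())

# Breakdown of the matches by the question column they came from
per_question = long_df[long_mask].groupby("question").size()
print(count_q1, count_all)
print(per_question)
```

Keeping the melted data in a separate variable leaves the original wide frame available for later steps.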
