<p>IIUC, this would be the equivalent (using 10%, since all of your sample rows are &lt; 75%):</p>
<pre><code>In [15]: df.frequency.sum()
Out[15]: 2526
In [16]: df.frequency / df.frequency.sum() < 0.1
Out[16]:
0     True
1     True
2    False
3    False
4    False
5     True
Name: frequency, dtype: bool
In [17]: df.loc[df.frequency / df.frequency.sum() < .1]
Out[17]:
   CategoryCount  frequency
0              0        123
1             12        234
5              0        145
In [18]: len(df.loc[df.frequency / df.frequency.sum() < .1])
Out[18]: 3
</code></pre>
<p>Or, slightly better, the <a href="https://stackoverflow.com/questions/40471490/select-stament-with-division-operator-mysql-using-pandas-dataframe/40471673?noredirect=1#comment68187856_40471673">variant from @John Galt</a>:</p>
<pre><code>In [19]: (df.frequency < df.frequency.sum() * 0.1 ).sum()
Out[19]: 3
</code></pre>
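<p>For anyone who wants to reproduce this, here is a minimal self-contained sketch. The original DataFrame is not shown in full, so the values for rows 2 to 4 are made up (chosen only so that the frequencies sum to 2526), and <code>count_below_share</code> is a hypothetical helper name, not part of pandas. It checks that the filter-then-count form and the boolean-sum form agree, and that at a 75% threshold every sample row qualifies:</p>

```python
import pandas as pd

# Assumed data: rows 0, 1 and 5 match the output shown above; the values
# for rows 2-4 are invented so that the frequencies sum to 2526.
df = pd.DataFrame({
    "CategoryCount": [0, 12, 3, 7, 1, 0],
    "frequency": [123, 234, 600, 700, 724, 145],
})


def count_below_share(frame, threshold):
    """Count rows whose frequency is below `threshold` times the total.

    Hypothetical helper; it wraps the boolean-sum variant.
    """
    total = frame["frequency"].sum()
    # Summing a boolean Series counts its True values.
    return int((frame["frequency"] < total * threshold).sum())


total = df["frequency"].sum()

# Filter-then-count form: build a boolean mask, select rows, take len().
mask = df["frequency"] / total < 0.1
selected = df.loc[mask]
n_filter = len(selected)

# Boolean-sum form: no intermediate DataFrame is created.
n_direct = count_below_share(df, 0.1)

# With a 75% threshold every row in this sample qualifies.
n_all = count_below_share(df, 0.75)

print(total, n_filter, n_direct, n_all)
```

<p>The boolean-sum form skips building the filtered DataFrame, which is why it is the nicer choice when only the count is needed.</p>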
<p>The OP's query in SQL:</p>
<p><a href="https://i.stack.imgur.com/8htHq.jpg" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/8htHq.jpg" alt="enter image description here"/></a></p>