<blockquote>
<p>Logical operators for boolean indexing in Pandas</p>
</blockquote>
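<p>It's important to realize that you cannot use any of the Python <em>logical operators</em> (<code>and</code>, <code>or</code> or <code>not</code>) on a <code>pandas.Series</code> or <code>pandas.DataFrame</code> (similarly, you cannot use them on a <code>numpy.array</code> with more than one element). The reason is that they implicitly call <code>bool</code> on their operands, which raises an exception because these data structures have decided that the boolean value of an array is ambiguous:</p>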
<pre><code>>>> import numpy as np
>>> import pandas as pd
>>> arr = np.array([1,2,3])
>>> s = pd.Series([1,2,3])
>>> df = pd.DataFrame([1,2,3])
>>> bool(arr)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
>>> bool(s)
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
>>> bool(df)
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
</code></pre>
<p>I have covered this issue in more detail elsewhere.</p>
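<p>As a minimal sketch of what this means in practice (the variable names are illustrative, not from the original answer): combining two comparisons with <code>and</code> fails for the same reason <code>bool(s)</code> does, while an explicit reduction such as <code>.any()</code> or <code>.all()</code> removes the ambiguity.</p>

```python
import pandas as pd

s = pd.Series([1, 2, 3])

# `and` calls bool() on its left operand, which pandas refuses to answer
# for a Series with more than one element.
try:
    mask = (s > 1) and (s < 3)
    raised = False
except ValueError:
    raised = True

# Stating the reduction explicitly removes the ambiguity.
any_above_one = bool((s > 1).any())  # is at least one element > 1?
all_above_one = bool((s > 1).all())  # is every element > 1?
```

<p>The reductions collapse the whole Series into a single truth value, so they are only appropriate when that is what you want; for element-wise results, use the functions and operators described in the following sections.</p>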
<h2>NumPy's logical functions</h2>
<p>However, <a href="https://docs.scipy.org/doc/numpy/reference/routines.logic.html" rel="nofollow noreferrer">NumPy</a> provides element-wise equivalents of these operators as functions that can be used on a <code>numpy.array</code>, <code>pandas.Series</code>, <code>pandas.DataFrame</code>, or any other (conforming) <code>numpy.array</code> subclass:</p>
<ul>
<li><code>and</code> has <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_and.html#numpy.logical_and" rel="nofollow noreferrer"><code>np.logical_and</code></a></li>
<li><code>or</code> has <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_or.html#numpy.logical_or" rel="nofollow noreferrer"><code>np.logical_or</code></a></li>
<li><code>not</code> has <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_not.html#numpy.logical_not" rel="nofollow noreferrer"><code>np.logical_not</code></a></li>
<li><a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.logical_xor.html#numpy.logical_xor" rel="nofollow noreferrer"><code>np.logical_xor</code></a>, which has no Python equivalent but is a logical <a href="https://en.wikipedia.org/wiki/XOR_gate" rel="nofollow noreferrer">"exclusive or"</a> operation</li>
</ul>
<p>So, essentially, one should use (assuming <code>df1</code> and <code>df2</code> are pandas DataFrames):</p>
<pre><code>np.logical_and(df1, df2)
np.logical_or(df1, df2)
np.logical_not(df1)
np.logical_xor(df1, df2)
</code></pre>
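<p>A small runnable sketch of the calls above, assuming two single-column boolean DataFrames with matching index and columns (the column name <code>x</code> and the data are made up for illustration):</p>

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"x": [True, True, False, False]})
df2 = pd.DataFrame({"x": [True, False, True, False]})

both = np.logical_and(df1, df2)         # True where both are True
either = np.logical_or(df1, df2)        # True where at least one is True
negated = np.logical_not(df1)           # element-wise negation of df1
exactly_one = np.logical_xor(df1, df2)  # True where exactly one is True

# Flatten to plain lists for easy inspection.
both_list = np.asarray(both).ravel().tolist()
exactly_one_list = np.asarray(exactly_one).ravel().tolist()
```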
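<h2>Bitwise functions and bitwise operators for booleans</h2>
<p>However, if you have a boolean NumPy array, pandas Series, or pandas DataFrame, you can also use the <a href="https://docs.scipy.org/doc/numpy/reference/routines.bitwise.html#elementwise-bit-operations" rel="nofollow noreferrer">element-wise bitwise functions</a> (for booleans they are, or at least should be, indistinguishable from the logical functions):</p>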
<ul>
<li>bitwise and: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.bitwise_and.html#numpy.bitwise_and" rel="nofollow noreferrer"><code>np.bitwise_and</code></a> or the <code>&</code> operator</li>
<li>bitwise or: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.bitwise_or.html#numpy.bitwise_or" rel="nofollow noreferrer"><code>np.bitwise_or</code></a> or the <code>|</code> operator</li>
<li>bitwise not: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.invert.html#numpy.invert" rel="nofollow noreferrer"><code>np.invert</code></a> (or the alias <code>np.bitwise_not</code>) or the <code>~</code> operator</li>
<li>bitwise xor: <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.bitwise_xor.html#numpy.bitwise_xor" rel="nofollow noreferrer"><code>np.bitwise_xor</code></a> or the <code>^</code> operator</li>
</ul>
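<p>To illustrate the "indistinguishable for booleans" claim, here is a short sketch on two boolean Series (the names <code>m1</code> and <code>m2</code> are illustrative). It compares the bitwise function, the bitwise operator, and the logical function:</p>

```python
import numpy as np
import pandas as pd

m1 = pd.Series([True, True, False, False])
m2 = pd.Series([True, False, True, False])

via_function = np.bitwise_and(m1, m2)  # bitwise function
via_operator = m1 & m2                 # bitwise operator
via_logical = np.logical_and(m1, m2)   # logical function

# For boolean input, all three spellings agree element by element.
same = via_function.equals(via_operator) and via_operator.equals(via_logical)
```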
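<p>Typically the operators are used. However, when combining them with comparison operators, remember to wrap each comparison in parentheses, because the bitwise operators have a <a href="https://docs.python.org/reference/expressions.html#operator-precedence" rel="nofollow noreferrer">higher precedence than the comparison operators</a>:</p>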
<pre><code>(df1 < 10) | (df2 > 10) # instead of the wrong df1 < 10 | df2 > 10
</code></pre>
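<p>This can be irritating, because the Python logical operators have a lower precedence than the comparison operators, so you would normally write <code>a < 10 and b > 10</code> (where <code>a</code> and <code>b</code> are, for example, simple integers) without needing parentheses.</p>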
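<p>The precedence pitfall can be made concrete with a small sketch. The values below are chosen only so that the two spellings disagree; with plain integers the unparenthesized form silently computes something else, and with Series it raises, because it turns into a chained comparison that calls <code>bool</code> on a Series.</p>

```python
import pandas as pd

a, b = 5, 0

# Parsed as `a < (10 | b) > 10`, i.e. `5 < 10 > 10`.
unparenthesized = a < 10 | b > 10
# The intended meaning: "a is below 10 or b is above 10".
parenthesized = (a < 10) | (b > 10)

s1 = pd.Series([5, 20, 20])
s2 = pd.Series([0, 20, 0])

# Correct element-wise mask.
mask = (s1 < 10) | (s2 > 10)

# Without parentheses this becomes `s1 < (10 | s2) > 10`, a chained
# comparison that needs the truth value of a Series.
try:
    s1 < 10 | s2 > 10
    raised = False
except ValueError:
    raised = True
```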
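<h2>Differences between logical and bitwise operations (on non-booleans)</h2>
<p>It must be stressed that bitwise and logical operations are only equivalent for boolean NumPy arrays (and boolean Series and DataFrames). If the inputs do not contain booleans, the operations give different results. The examples below use NumPy arrays, but the results are similar for the pandas data structures:</p>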
<pre><code>>>> import numpy as np
>>> a1 = np.array([0, 0, 1, 1])
>>> a2 = np.array([0, 1, 0, 1])
>>> np.logical_and(a1, a2)
array([False, False, False, True])
>>> np.bitwise_and(a1, a2)
array([0, 0, 0, 1], dtype=int32)
</code></pre>
<p>And since NumPy (and similarly pandas) treats boolean indices (<a href="https://docs.scipy.org/doc/numpy/user/basics.indexing.html#boolean-or-mask-index-arrays" rel="nofollow noreferrer">Boolean or “mask” index arrays</a>) differently from integer indices (<a href="https://docs.scipy.org/doc/numpy/user/basics.indexing.html#index-arrays" rel="nofollow noreferrer">Index arrays</a>), the results of indexing will differ as well:</p>
<pre><code>>>> a3 = np.array([1, 2, 3, 4])
>>> a3[np.logical_and(a1, a2)]
array([4])
>>> a3[np.bitwise_and(a1, a2)]
array([1, 1, 1, 2])
</code></pre>
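<p>One way to avoid this surprise, sketched here as a suggestion rather than something prescribed above, is to convert integer arrays to booleans with <code>astype(bool)</code> before combining them with the bitwise operators. The names <code>b1</code> and <code>b2</code> are illustrative:</p>

```python
import numpy as np

a1 = np.array([0, 0, 1, 1])
a2 = np.array([0, 1, 0, 1])
a3 = np.array([1, 2, 3, 4])

# Convert the 0/1 integers to real booleans first.
b1 = a1.astype(bool)
b2 = a2.astype(bool)

masked = a3[b1 & b2]  # boolean mask: keeps elements where both are True
fancy = a3[a1 & a2]   # integer result: used as positions [0, 0, 0, 1]
```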
<h2>Summary table</h2>
<pre class="lang-none prettyprint-override"><code>Logical operator | NumPy logical function | NumPy bitwise function | Bitwise operator
-------------------------------------------------------------------------------------
and              | np.logical_and         | np.bitwise_and         | &
-------------------------------------------------------------------------------------
or               | np.logical_or          | np.bitwise_or          | |
-------------------------------------------------------------------------------------
                 | np.logical_xor         | np.bitwise_xor         | ^
-------------------------------------------------------------------------------------
not              | np.logical_not         | np.invert              | ~
</code></pre>
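<p>The logical operators do not work on NumPy arrays, pandas Series, or pandas DataFrames. The others work on these data structures (and on plain Python objects) and operate element-wise.
However, be careful with bitwise invert on plain Python <code>bool</code>s, because the bool is interpreted as an integer in this context (for example, <code>~False</code> returns <code>-1</code> and <code>~True</code> returns <code>-2</code>).</p>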
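<p>A brief sketch of that last caveat, using an illustrative variable <code>flag</code>. Newer Python versions may emit a <code>DeprecationWarning</code> for <code>~</code> on a <code>bool</code>, so the example silences warnings around that one line:</p>

```python
import warnings

import numpy as np

flag = True

with warnings.catch_warnings():
    # Newer Python versions may warn about `~` applied to a bool.
    warnings.simplefilter("ignore")
    as_integer = ~flag  # bool treated as the integer 1, so the result is -2

negated = not flag            # plain Python negation of a single bool
np_negated = np.invert(flag)  # NumPy treats the input as a boolean value
```

<p>For a single Python <code>bool</code>, <code>not</code> is the natural choice; <code>~</code> only behaves as a logical negation on boolean arrays, Series, and DataFrames.</p>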