<p>I think you need:</p>
<pre><code>hospProfiling.loc[hospProfiling.groupby(['Hospital_ID', 'District_ID'])['Hospital_employees']
.idxmax()]
</code></pre>
<p>I was quite surprised by the other answer, so I did some research into whether the function <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.DataFrameGroupBy.idxmax.html" rel="nofollow"><code>idxmax</code></a> is really useless:</p>
<p>Sample:</p>
^{pr2}$
<p>The main difference is how the other columns are handled. If you use <code>max</code>, it returns the maximum of each column separately, here <code>Hospital_employees</code> and {<cd4>}:</p>
<pre><code>c_maxes = hospProfiling.groupby(['Hospital_ID','District_ID'],as_index = False).max()
print (c_maxes)
Hospital_ID District_ID Hospital_employees Name Val
0 A F 41 Annie 7
1 A M 56 Sam 200
2 B F 28 Julie 9
3 B M 70 James 20
c_maxes = (hospProfiling.groupby(['Hospital_ID','District_ID'], as_index=False)
                        .agg({'Hospital_employees': 'max'}))
print (c_maxes)
Hospital_ID District_ID Hospital_employees
0 A F 41
1 A M 56
2 B F 28
3 B M 70
</code></pre>
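<p>To make the per-column behaviour of <code>max</code> concrete, here is a minimal sketch. The frame <code>df</code> and its values are hypothetical, invented for illustration, and are not the original sample:</p>

```python
import pandas as pd

# Hypothetical data, invented for illustration (not the original sample).
df = pd.DataFrame({
    'Hospital_ID': ['A', 'A', 'A', 'B'],
    'District_ID': ['M', 'M', 'F', 'M'],
    'Hospital_employees': [56, 10, 41, 70],
    'Name': ['Alan', 'Sam', 'Annie', 'Fred'],
})

# max() is evaluated column by column within each group, so the values
# in one output row may come from different input rows.
col_maxes = df.groupby(['Hospital_ID', 'District_ID'], as_index=False).max()
print(col_maxes)
```

<p>In this sketch the (A, M) row pairs 56 employees with the name Sam, although Sam's own row has only 10 employees, because 'Sam' is simply the largest string in that group.</p>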
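<p>The function <code>idxmax</code> instead returns, for each group, the index label of the row that holds the maximum of the selected column:</p>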
<pre><code>print (hospProfiling.groupby(['Hospital_ID', 'District_ID'])['Hospital_employees'].idxmax())
A F 1
M 10
B F 11
M 2
Name: Hospital_employees, dtype: int64
</code></pre>
<p>Then you just select those rows of the <code>DataFrame</code> with <a href="http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html" rel="nofollow"><code>loc</code></a>:</p>
<pre><code>c_maxes = hospProfiling.loc[hospProfiling.groupby(['Hospital_ID', 'District_ID'])['Hospital_employees']
.idxmax()]
print (c_maxes)
District_ID Hospital_ID Hospital_employees Name Val
1 F A 41 Annie 7
10 M A 56 Alan 6
11 F B 28 Julie 9
2 M B 70 Fred 14
</code></pre>
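<p>One thing to keep in mind is ties: <code>idxmax</code> returns only the first index label per group when several rows share the maximum. The sketch below uses hypothetical data (the frame <code>df</code> is invented for illustration) and also shows one possible alternative, comparing against <code>transform('max')</code>, if you want to keep every tied row:</p>

```python
import pandas as pd

# Hypothetical data, invented for illustration; group (B, M) has a tie at 70.
df = pd.DataFrame({
    'Hospital_ID': ['A', 'A', 'A', 'B', 'B'],
    'District_ID': ['M', 'M', 'F', 'M', 'M'],
    'Hospital_employees': [56, 10, 41, 70, 70],
    'Name': ['Alan', 'Sam', 'Annie', 'Fred', 'James'],
})

groups = df.groupby(['Hospital_ID', 'District_ID'])['Hospital_employees']

# One complete row per group: idxmax keeps the first row on ties.
first_max_rows = df.loc[groups.idxmax()]
print(first_max_rows)

# Alternative: keep every row whose value equals its group maximum.
all_max_rows = df[df['Hospital_employees'] == groups.transform('max')]
print(all_max_rows)
```

<p>With this data, <code>first_max_rows</code> keeps Fred but not James for group (B, M), while <code>all_max_rows</code> keeps both.</p>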