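<p>I want to use a pivot table to summarize a dataset and then access the information in the pivot table as if it were a DataFrame.</p>
<p>Consider a hierarchical dataset in which patients are treated in hospitals, and hospitals are located in regions:</p>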
<pre><code>import pandas as pd
example_data = {'patient' : ['p1','p2','p3','p4','p5','p6','p7','p8','p9','p10','p11','p12','p13','p14','p15','p16','p17','p18','p19','p20','p21','p22','p23','p24','p25','p26','p27','p28','p29','p30','p31','p32','p33','p34','p35','p36','p37','p38','p39','p40','p41','p42','p43','p44','p45','p46','p47','p48','p49','p50','p51','p52','p53','p54','p55','p56','p57','p58','p59','p60','p61','p62','p63'],
'hospital' : ['h1','h1','h1','h2','h2','h2','h2','h3','h3','h3','h3','h3','h4','h4','h4','h4','h4','h4','h5','h5','h5','h5','h5','h5','h5','h6','h6','h6','h6','h6','h6','h6','h6','h7','h7','h7','h7','h7','h7','h7','h7','h7','h8','h8','h8','h8','h8','h8','h8','h8','h8','h8','h9','h9','h9','h9','h9','h9','h9','h9','h9','h9','h9'],
'region' : ['r1','r1','r1','r1','r1','r1','r1','r1','r1','r1','r1','r1','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r2','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3','r3'] }
example_dataframe = pd.DataFrame(example_data)
print(example_dataframe)
</code></pre>
<p>This produces simple output like the following:</p>
<pre><code>   hospital patient region
0        h1      p1     r1
1        h1      p2     r1
2        h1      p3     r1
3        h2      p4     r1
4        h2      p5     r1
5        h2      p6     r1
6        h2      p7     r1
7        h3      p8     r1
8        h3      p9     r1
9        h3     p10     r1
10       h3     p11     r1
11       h3     p12     r1
12       h4     p13     r2
13       h4     p14     r2
14       h4     p15     r2
15       h4     p16     r2
16       h4     p17     r2
etc.
</code></pre>
<p>Now I want to summarize this with a pivot table that simply counts the number of patients in each hospital:</p>
<pre><code># `index=` replaces the `rows=` keyword, which newer pandas versions no longer accept
example_pivot_table = pd.pivot_table(example_dataframe, values='patient', index=['hospital','region'], aggfunc='count')
print(example_pivot_table)
</code></pre>
<p>This produces the following output:</p>
<pre><code>hospital  region
h1        r1         3
h2        r1         4
h3        r1         5
h4        r2         6
h5        r2         7
h6        r2         8
h7        r3         9
h8        r3        10
h9        r3        11
Name: patient, dtype: int64
</code></pre>
<p>As far as I can tell, this is actually a Series with a MultiIndex.</p>
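<p>One way to check that guess is to inspect the result's type and index. The sketch below is only illustrative: it rebuilds the example data in a compact form (an assumption made for brevity, relying on hospitals h1 to h9 having 3 to 11 patients and three hospitals per region), and depending on the pandas version the pivot result may be a Series or a one-column DataFrame.</p>

```python
import pandas as pd

# Compact rebuild of the question's data (assumed shortcut, same contents):
# hospital h1..h9 has 3..11 patients, and each region holds three hospitals.
sizes = {'h%d' % k: k + 2 for k in range(1, 10)}
hospitals = [h for h, n in sizes.items() for _ in range(n)]
example_dataframe = pd.DataFrame({
    'patient': ['p%d' % i for i in range(1, len(hospitals) + 1)],
    'hospital': hospitals,
    'region': ['r%d' % ((int(h[1:]) - 1) // 3 + 1) for h in hospitals],
})

example_pivot_table = pd.pivot_table(example_dataframe, values='patient',
                                     index=['hospital', 'region'], aggfunc='count')

# The container type varies by pandas version (Series or DataFrame),
# but in both cases the grouping keys are stored in the index.
print(type(example_pivot_table))

# True if the index has multiple levels
print(isinstance(example_pivot_table.index, pd.MultiIndex))

# Names of the index levels, in order
print(list(example_pivot_table.index.names))
```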
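<p>How can I use this data to find out which region hospital h7 is in? If <code>hospital</code>, <code>region</code>, and the patient count were separate columns in a DataFrame, this would be easy. But I think hospital and region are index levels. I have tried many things without success.</p>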
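<p>For illustration, here is a minimal sketch of two possible approaches, not necessarily the only or best ones: <code>reset_index()</code> turns the index levels back into ordinary columns, and <code>index.get_level_values()</code> reads a level directly. The compact data rebuild and the names <code>flat</code>, <code>h7_region</code>, and <code>h7_region_from_index</code> are assumptions introduced for this example.</p>

```python
import pandas as pd

# Compact rebuild of the question's data (assumed shortcut, same contents):
# hospital h1..h9 has 3..11 patients, and each region holds three hospitals.
sizes = {'h%d' % k: k + 2 for k in range(1, 10)}
hospitals = [h for h, n in sizes.items() for _ in range(n)]
example_dataframe = pd.DataFrame({
    'patient': ['p%d' % i for i in range(1, len(hospitals) + 1)],
    'hospital': hospitals,
    'region': ['r%d' % ((int(h[1:]) - 1) // 3 + 1) for h in hospitals],
})

example_pivot_table = pd.pivot_table(example_dataframe, values='patient',
                                     index=['hospital', 'region'], aggfunc='count')

# Option 1: move the index levels into regular columns, then filter as usual.
# This works whether the pivot result is a Series or a DataFrame.
flat = example_pivot_table.reset_index()
h7_region = flat.loc[flat['hospital'] == 'h7', 'region'].iloc[0]
print(h7_region)

# Option 2: keep the MultiIndex and read the levels directly.
idx = example_pivot_table.index
mask = idx.get_level_values('hospital') == 'h7'
h7_region_from_index = idx.get_level_values('region')[mask][0]
print(h7_region_from_index)
```

<p>After <code>reset_index()</code>, the count sits in a column named <code>patient</code>, so <code>flat</code> can be queried like any other DataFrame.</p>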