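<p>I have the following DataFrame, and I want to keep all of the rows for an "Individual ID" once one or more conditions are met for that individual.</p>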
<pre><code>import pandas as pd
data = [['A-1', 'Birth','0'],
['A-1','Sickle cell',"5"],['A-1', 'Lung cancer',"25"],
['A-1','Death','35'],['A-2', 'Birth', '0'],
['A-2','Sarcoma','10'],['A-2', 'Melanoma','19'],
['A-2', 'Current Age', '20'], ['A-3', 'Birth',"0"],
['A-3','Sickle cell','25'],['A-3', "Skin cancer", "29"],
['A-3', "Current Age", '40']]
df = pd.DataFrame(data,columns=["Individual ID", "Diagnosis","Age"])
print(df)
</code></pre>
<p>I tried the following code:</p>
<pre><code>first = pd.DataFrame(df.groupby("Individual ID").filter(lambda g: g["Individual ID"].size > 3))
breast1 = ((first["Repeat Instance"] == 1) & (first["Diagnosis"] == "Sickle cell"))
after = first[breast1]
print(after)
</code></pre>
<p>After running the code, I get the following result:</p>
<pre><code>  Individual ID    Diagnosis Age  Repeat Instance
1           A-1  Sickle cell   5                1
9           A-3  Sickle cell  25                1
</code></pre>
<p>I want to get the rest of the information for individuals A-1 and A-3 (birth, current age, other diagnoses), but I haven't figured out how.</p>
<p>Any help would be greatly appreciated.</p>
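<p>For reference, one possible way to get the result described above is sketched here. It assumes the condition is simply "the individual has at least one Sickle cell row" and ignores the "Repeat Instance" column, which is not part of the sample data. The names <code>matching_ids</code> and <code>result</code> are illustrative only.</p>

```python
import pandas as pd

# Same sample data as in the question
data = [['A-1', 'Birth', '0'], ['A-1', 'Sickle cell', '5'],
        ['A-1', 'Lung cancer', '25'], ['A-1', 'Death', '35'],
        ['A-2', 'Birth', '0'], ['A-2', 'Sarcoma', '10'],
        ['A-2', 'Melanoma', '19'], ['A-2', 'Current Age', '20'],
        ['A-3', 'Birth', '0'], ['A-3', 'Sickle cell', '25'],
        ['A-3', 'Skin cancer', '29'], ['A-3', 'Current Age', '40']]
df = pd.DataFrame(data, columns=["Individual ID", "Diagnosis", "Age"])

# Step 1: collect the IDs that have at least one row meeting the condition
# (assumed condition: Diagnosis == "Sickle cell")
matching_ids = df.loc[df["Diagnosis"] == "Sickle cell", "Individual ID"].unique()

# Step 2: keep every row belonging to those IDs, not only the matching rows
result = df[df["Individual ID"].isin(matching_ids)]
print(result)
```

<p>The idea is to use the condition only to find the IDs, then select by ID with <code>isin</code>, so the Birth, Current Age (or Death), and other diagnosis rows come along as well.</p>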