"""
I have a DataFrame with a million rows, and I ran .groupby() on it.
"""
import pandas as pd

df = pd.DataFrame({
    'id': ['g1', 'g1', 'g1', 'g1', 'g2', 'g2', 'g2', 'g2', 'g2', 'g2'],
    'Trans': ['g1.1', 'g1.2', 'g1.3', 'g1.4', 'g2.1', 'g2.2', 'g2.3', 'g2.2', 'g2.1', 'g2.1'],
    'Tissue': ['Lf', 'Lf', 'Lf', 'pc', 'Pol', 'Pol', 'Pol', 'Ant', 'Ant', 'm2'],
    'val': [0.0948, 1.5749, 1.8904, 0.8673, 2.1089, 2.5058, 4.5722, 0.7626, 3.1381, 2.723],
})
print(df)

# Collect the row with the highest 'val' from each (id, Tissue) group.
# DataFrame.append returned a new object (and was removed in pandas 2.0),
# so the pieces are gathered in a list and concatenated once at the end.
rows = []
for grp_id, data in df.groupby(['id', 'Tissue']):
    rows.append(data.nlargest(1, 'val'))
df_highest = pd.concat(rows)
df_highest.to_csv('out.txt', sep='\t', index=False)
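If you are trying to get the maximum value for each combination of id and Tissue, you can let groupby do the aggregation directly instead of looping over the groups.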
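A minimal sketch of that approach, assuming the sample data and column names (`id`, `Tissue`, `val`) from the question. The names `max_vals` and `df_highest` are illustrative, and this may not be the exact code the answer had in mind:

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['g1', 'g1', 'g1', 'g1', 'g2', 'g2', 'g2', 'g2', 'g2', 'g2'],
    'Trans': ['g1.1', 'g1.2', 'g1.3', 'g1.4', 'g2.1', 'g2.2', 'g2.3', 'g2.2', 'g2.1', 'g2.1'],
    'Tissue': ['Lf', 'Lf', 'Lf', 'pc', 'Pol', 'Pol', 'Pol', 'Ant', 'Ant', 'm2'],
    'val': [0.0948, 1.5749, 1.8904, 0.8673, 2.1089, 2.5058, 4.5722, 0.7626, 3.1381, 2.723],
})

# Maximum 'val' per (id, Tissue) group, returned as a flat DataFrame
max_vals = df.groupby(['id', 'Tissue'], as_index=False)['val'].max()

# To keep the whole row (including 'Trans') that holds each maximum,
# look up the index label of the largest 'val' within every group
df_highest = df.loc[df.groupby(['id', 'Tissue'])['val'].idxmax()]

print(max_vals)
print(df_highest)
```

`df_highest` can then be written out with `to_csv('out.txt', sep='\t', index=False)` as in the question. Because the work happens in a single vectorized pass, this should scale much better to a million rows than appending inside a Python loop.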
Taking the mean over the same groups instead gives you the average value for each combination of id and Tissue.
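As a sketch, again assuming the question's sample data and column names (the name `mean_vals` is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'id': ['g1', 'g1', 'g1', 'g1', 'g2', 'g2', 'g2', 'g2', 'g2', 'g2'],
    'Trans': ['g1.1', 'g1.2', 'g1.3', 'g1.4', 'g2.1', 'g2.2', 'g2.3', 'g2.2', 'g2.1', 'g2.1'],
    'Tissue': ['Lf', 'Lf', 'Lf', 'pc', 'Pol', 'Pol', 'Pol', 'Ant', 'Ant', 'm2'],
    'val': [0.0948, 1.5749, 1.8904, 0.8673, 2.1089, 2.5058, 4.5722, 0.7626, 3.1381, 2.723],
})

# Average 'val' per (id, Tissue) group; selecting 'val' first keeps the
# non-numeric 'Trans' column out of the aggregation
mean_vals = df.groupby(['id', 'Tissue'], as_index=False)['val'].mean()

print(mean_vals)
```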