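Apologies if the question is not entirely clear. I do have some sample code showing the required input and output (see below).
I have a (large) DataFrame and would like to select the minimum value of pval1 together with its corresponding lag. I would also like to select the minimum value of pval2 together with its corresponding lag. I want to do this for every pair of variables, i.e. (A and B), (A and C) and (B and D). Each pair of variables appears several times in the dataset.
I have tried several approaches to get the output I want, but I seem to be missing something in the logic and I am not sure what. Any help would be much appreciated.
Thanks in advance to anyone who can help.
The DataFrame looks like this: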
import pandas as pd

myxdf = pd.DataFrame({
'pval1': [0.01,0.2,0.001,0.3,0.0003,0.05,1,0.002,0.2],
'pval2': [0.3,0.02,0.002,0.9,0.001,0.002,0.10,0.93,0.00001],
'lag': [1,2,3,1,2,3,1,2,3],
'var1': ['A','A','A','A','A','A','B','B','B'],
'var2': ['B','B','B','C','C','C','D','D','D']
})
myxdf
The desired output for the example above should look like this (note the new lagp1 and lagp2 columns):
myxdf2 = pd.DataFrame({
'pval1': [0.0010,0.0003,0.002],
'pval2' : [0.002,0.001,0.00001],
'lagp1': ['3','2','2'],
'lagp2': ['3','2','3'],
'var1': ['A','A','B'],
'var2': ['B','C','D']
})
myxdf2
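I believe you need the index of the minimum value within each group: use it to select the rows, rename the columns, and then join the results.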
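The method names in this answer did not survive, so the following is only a minimal sketch of the approach it describes, not necessarily the answerer's exact code. It assumes pandas' `groupby(...).idxmin()` for locating the row with the smallest value per pair and `pd.concat` for the join; the helper `min_rows` and the `keys` list are names made up for illustration.

```python
import pandas as pd

myxdf = pd.DataFrame({
    'pval1': [0.01, 0.2, 0.001, 0.3, 0.0003, 0.05, 1, 0.002, 0.2],
    'pval2': [0.3, 0.02, 0.002, 0.9, 0.001, 0.002, 0.10, 0.93, 0.00001],
    'lag': [1, 2, 3, 1, 2, 3, 1, 2, 3],
    'var1': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B'],
    'var2': ['B', 'B', 'B', 'C', 'C', 'C', 'D', 'D', 'D'],
})

keys = ['var1', 'var2']  # columns that identify a pair of variables


def min_rows(df, col, lag_name):
    """Hypothetical helper: for each pair, keep the row where `col` is smallest."""
    # idxmin gives, per group, the index label of the row holding the minimum
    idx = df.groupby(keys)[col].idxmin()
    return (df.loc[idx, keys + [col, 'lag']]
              .rename(columns={'lag': lag_name})
              .set_index(keys))


# Align the two per-pair results on (var1, var2) and place them side by side
out = pd.concat(
    [min_rows(myxdf, 'pval1', 'lagp1'),
     min_rows(myxdf, 'pval2', 'lagp2')],
    axis=1,
).reset_index()

# Reorder columns to match the layout requested in the question
out = out[['pval1', 'pval2', 'lagp1', 'lagp2', 'var1', 'var2']]
print(out)
```

In this sketch the lag columns stay as integers, whereas the desired output in the question stores them as strings; if strings are really needed, `out[['lagp1', 'lagp2']].astype(str)` would convert them. Note also that when a pair has tied minima, `idxmin` keeps only the first occurrence.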