Filtering a DataFrame / deriving scalar values from a column

Posted 2024-06-17 15:36:03

My data is in a CSV file that looks like this:

(m-M),err(m-M),D,Method,Refcode,Notes,SN Name,Redshift,H0,LMCModulus
28.96,0.20,6.190,SNII optical,2017ApJ...841..127M,EPM,SN 2013ej,,,
29.13,,6.700,SNII optical,2004A&A...427..453V,EPM,SN 2002ap,,,
29.29,,7.200,SNII optical,2006PASP..118..351V,,SN 2003gd,,,
29.94,0.54,9.730,SNII optical,2010ApJ...715..833O,"SCM, I",SN 2003gd,,,
29.98,0.28,9.910,SNII optical,2010ApJ...715..833O,"SCM, BVI",SN 2003gd,,,
29.98,0.55,9.910,SNII optical,2010ApJ...715..833O,"SCM, V",SN 2003gd,,,
29.99,0.42,9.950,SNII optical,2010ApJ...715..833O,"SCM, B",SN 2003gd,,,
30.01,0.07,10.000,SNII optical,2014AJ....148..107R,"V, photospheric magnitude method",SN 2013ej,,,
26.72,0.69,2.210,Tully-Fisher,1984A&AS...56..381B,B,,,103.00,
29.93,0.40,9.700,Tully-Fisher,1988NBGC.C....0000T,B,,,75.00,

My code is:

import pandas as pd
from pandas import DataFrame

d = pd.read_csv('ngc0628_zid.csv')

d  # Whole of the CSV prints OK

d.loc[:, 'D':'Method']

sub_d = d.loc[d['Method'] == 'SNII optical']   # Filter for 'SNII Optical' only - OK
sub_d.loc[:, 'D':'Method']   # Just report columns 'D' and 'Method' - OK

maxColumn = sub_d.max(axis=0)
maxColumn     # Prints max of all values

minColumn = sub_d.min(axis=0)
minColumn     # Prints min of all values

meanColumn = sub_d.mean(axis=0, numeric_only=True)   # numeric_only avoids a TypeError on text columns in pandas 2.x
meanColumn     # Prints mean of all numeric values

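Problem: I can't find a way to select just the 'D' column for mean, max, and min without getting a syntax error. In every case I only get a table of values rather than the 3 scalars I need.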


3 Answers

IIUC

import pandas as pd
import numpy as np

from io import StringIO

csvfile = StringIO("""(m-M),err(m-M),D,Method,Refcode,Notes,SN Name,Redshift,H0,LMCModulus
28.96,0.20,6.190,SNII optical,2017ApJ...841..127M,EPM,SN 2013ej,,,
29.13,,6.700,SNII optical,2004A&A...427..453V,EPM,SN 2002ap,,,
29.29,,7.200,SNII optical,2006PASP..118..351V,,SN 2003gd,,,
29.94,0.54,9.730,SNII optical,2010ApJ...715..833O,"SCM, I",SN 2003gd,,,
29.98,0.28,9.910,SNII optical,2010ApJ...715..833O,"SCM, BVI",SN 2003gd,,,
29.98,0.55,9.910,SNII optical,2010ApJ...715..833O,"SCM, V",SN 2003gd,,,
29.99,0.42,9.950,SNII optical,2010ApJ...715..833O,"SCM, B",SN 2003gd,,,
30.01,0.07,10.000,SNII optical,2014AJ....148..107R,"V, photospheric magnitude method",SN 2013ej,,,
26.72,0.69,2.210,Tully-Fisher,1984A&AS...56..381B,B,,,103.00,
29.93,0.40,9.700,Tully-Fisher,1988NBGC.C....0000T,B,,,75.00,""")

df = pd.read_csv(csvfile)

vmin, vmax, vmean, vmedian = df['D'].agg(['min', 'max', 'mean', 'median'])

print(vmin)
print(vmax)
print(vmean)
print(vmedian)

print(f'The min is {vmin}. The max is {vmax}. The mean is {vmean}. The median is {vmedian}.')

Output:

2.21
10.0
8.15
9.715
The min is 2.21. The max is 10.0. The mean is 8.15. The median is 9.715.
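
The aggregation above runs over every row, while the question also restricts the data to `Method == 'SNII optical'`. A minimal sketch that combines the row filter with a single-column selection is shown here. It assumes the same `D` and `Method` column names and, purely for illustration, uses an inline two-column subset of the question's data; the names `snii_d`, `d_min`, `d_max`, and `d_mean` are made up for this example.

```python
import pandas as pd
from io import StringIO

# Inline subset of the question's CSV (only the two columns used here).
csv_text = """D,Method
6.190,SNII optical
6.700,SNII optical
7.200,SNII optical
9.730,SNII optical
9.910,SNII optical
9.910,SNII optical
9.950,SNII optical
10.000,SNII optical
2.210,Tully-Fisher
9.700,Tully-Fisher
"""
df = pd.read_csv(StringIO(csv_text))

# A boolean mask selects the rows; a single column label (not a slice)
# selects the column, so the result is a Series rather than a DataFrame.
snii_d = df.loc[df['Method'] == 'SNII optical', 'D']

# A reduction on a Series returns one scalar, not a table.
d_min = snii_d.min()
d_max = snii_d.max()
d_mean = snii_d.mean()

print(d_min, d_max, d_mean)
```

The question's `sub_d.max(axis=0)` reduces every column of a DataFrame, which is why it returns one value per column. Selecting `'D'` first, so that the reduction runs on a Series, is what yields the three scalars.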

To perform any kind of statistical operation, we can simply do the following:

maxColumn = d['D'].max()
maxColumn

You can select the D column simply by writing d['D'] or d.D. Is that what you were asking?
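
As a small sketch of the two spellings, the snippet below builds a tiny stand-in DataFrame (the three rows are taken from the question's data, and the name `sn_names` is only illustrative). Attribute access works only when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute, so columns such as 'SN Name' or '(m-M)' have to use brackets.

```python
import pandas as pd

# Tiny stand-in for the question's DataFrame `d`.
d = pd.DataFrame({
    'D': [6.19, 6.70, 2.21],
    'SN Name': ['SN 2013ej', 'SN 2002ap', None],
})

# Bracket and attribute access return the same Series for 'D'.
same = d['D'].equals(d.D)

# 'SN Name' contains a space, so only bracket access is possible.
sn_names = d['SN Name']

# Reducing the selected column gives a single scalar.
largest = d.D.max()
print(same, largest)
```

Bracket access is the safer default in scripts, since it keeps working if a column is later renamed to something that is not a valid identifier.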
