mean() of a column in a pandas DataFrame returns inf: how can I fix this?

import pandas as pd

bcw = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data', header=None)

# Drop rows that contain '?' in any non-integer column
for col in bcw.columns:
    if bcw[col].dtype != 'int64':
        print("Removendo possivel '?' na coluna %s..." % col)
        bcw = bcw[bcw[col] != '?']

valores = bcw.iloc[:, 1:10]

# mean returns inf
print(valores.iloc[:, 5].mean())

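3 Answers

User

#1 · edited 2024-04-25 20:17:27

If the elements of the pandas Series are strings, mean() can return inf. A likely cause is that summing strings concatenates them into one very long digit string, which overflows to inf when it is converted to a number. In this particular case, you only need to convert the Series elements to float and then compute the mean. There is no need to use numpy.

Example: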

valores.iloc[:,5].astype(float).mean()
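
If some '?' placeholders are still present, astype(float) would raise an error. A minimal sketch of an alternative, assuming a small made-up Series (the name raw and its values are hypothetical, not taken from the dataset), is to coerce unparseable entries to NaN with pd.to_numeric:

```python
import pandas as pd

# Hypothetical stand-in for one column of the dataset: numeric strings
# with '?' marking a missing value.
raw = pd.Series(["1", "10", "?", "3"], dtype=object)

# errors="coerce" turns anything unparseable (here '?') into NaN.
numeric = pd.to_numeric(raw, errors="coerce")

# mean() skips NaN by default, so only 1, 10 and 3 are averaged.
col_mean = numeric.mean()
print(col_mean)
```

This keeps the rows instead of filtering them out first. Whether that is preferable depends on how the missing values should be treated in the rest of the analysis.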

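User

#2 · edited 2024-04-25 20:17:27

NaN values should not matter when computing the mean of a pandas.Series, and neither should precision. The only explanation I can think of is that one of the values in valores is equal to infinity.

You can exclude any infinite values when computing the mean, as follows: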

import numpy as np

is_inf = valores.iloc[:, 5] == np.inf
# .ix was removed in pandas 1.0; select the column by position, then apply the mask
valores.iloc[:, 5][~is_inf].mean()
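
The comparison with np.inf above only catches positive infinity. As a rough sketch, assuming the column already has a numeric dtype (the Series col below is made-up illustration data), np.isinf can be used to drop both +inf and -inf:

```python
import numpy as np
import pandas as pd

# Hypothetical float column containing both infinities and a NaN.
col = pd.Series([1.0, np.inf, 3.0, -np.inf, np.nan])

# np.isinf flags +inf and -inf; it requires a numeric dtype.
finite = col[~np.isinf(col)]

# NaN is skipped by mean() by default, leaving 1.0 and 3.0.
finite_mean = finite.mean()
print(finite_mean)
```

Note that np.isinf does not accept a column of strings, so a string column would need to be converted to float first.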

User

#3 · edited 2024-04-25 20:17:27

I'm not very familiar with pandas, but it works if you convert the column to a numpy array. Try:

# np.float was removed in NumPy 1.24; the builtin float is equivalent here
np.asarray(valores.iloc[:, 5], dtype=float).mean()
