pandas sparse DataFrame: value_counts does not work
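I get a TypeError when calling value_counts on a column of a pandas sparse DataFrame. The package versions I am using are shown in the session below.
Any suggestions on how to work around this?
Thanks in advance. Let me know if more information is needed.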
Python 2.7.6 |Anaconda 1.9.1 (x86_64)| (default, Jan 10 2014, 11:23:15)
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> print pandas.__version__
0.13.1
>>> import numpy
>>> print numpy.__version__
1.8.0
>>> dense_df = pandas.DataFrame(numpy.zeros((10, 10)),
...                             columns=['x%d' % ix for ix in range(10)])
>>> dense_df['x5'] = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]
>>> print dense_df['x5'].value_counts()
0.0 6
1.0 2
3.0 1
2.1 1
dtype: int64
>>> sparse_df = dense_df.to_sparse(fill_value=0) # Tried fill_value=0.0 also
>>> print sparse_df.density
0.04
>>> print sparse_df['x5'].value_counts()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 1156, in value_counts
    normalize=normalize, bins=bins)
  File "//anaconda/lib/python2.7/site-packages/pandas/core/algorithms.py", line 231, in value_counts
    values = com._ensure_object(values)
  File "generated.pyx", line 112, in pandas.algos.ensure_object (pandas/algos.c:38788)
  File "generated.pyx", line 117, in pandas.algos.ensure_object (pandas/algos.c:38695)
  File "//anaconda/lib/python2.7/site-packages/pandas/sparse/array.py", line 377, in astype
    raise TypeError('Can only support floating point data for now')
TypeError: Can only support floating point data for now
1 Answer
This is not implemented for sparse data yet. For now, convert the column to dense first:
In [12]: sparse_df['x5'].to_dense().value_counts()
Out[12]:
0.0 6
1.0 2
3.0 1
2.1 1
dtype: int64
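
For readers on a newer pandas, here is a minimal sketch of the same densify-then-count idea. It is not the code from the question: `to_sparse` was removed in pandas 1.0, so this assumes a release that provides `pd.arrays.SparseArray` and the `.sparse` accessor instead. The variable names (`values`, `sparse_col`, `dense_col`, `counts`) are made up for illustration.

```python
import pandas as pd

# Same values as column 'x5' in the question.
values = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]

# Build a sparse-backed Series; entries equal to fill_value are not stored.
# Assumption: a pandas version with pd.arrays.SparseArray (not 0.13.1).
sparse_col = pd.Series(pd.arrays.SparseArray(values, fill_value=0.0))

# Densify first, then count, mirroring to_dense().value_counts() above.
dense_col = sparse_col.sparse.to_dense()
counts = dense_col.value_counts()
print(counts)
```

The counts should match the dense result shown in the answer. Densifying materializes every stored and filled value, so on a very wide or long sparse frame it may be worth doing this one column at a time.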