pandas sparse DataFrame: value_counts does not work

1 vote
1 answer
570 views
Asked 2025-04-18 00:08

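I get a TypeError when calling value_counts on a column of a pandas sparse DataFrame. The package versions I am using and a full reproduction are shown below.

Does anyone have a suggestion for how to work around this?

Thanks in advance. Let me know if you need more information.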

Python 2.7.6 |Anaconda 1.9.1 (x86_64)| (default, Jan 10 2014, 11:23:15) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import pandas
>>> print pandas.__version__
0.13.1
>>> import numpy
>>> print numpy.__version__
1.8.0

>>> dense_df = pandas.DataFrame(numpy.zeros((10, 10))
                               ,columns=['x%d' % ix for ix in range(10)])
>>> dense_df['x5'] = [1.0, 0.0, 0.0, 1.0, 2.1, 3.0, 0.0, 0.0, 0.0, 0.0]
>>> print dense_df['x5'].value_counts()
0.0    6
1.0    2
3.0    1
2.1    1
dtype: int64

>>> sparse_df = dense_df.to_sparse(fill_value=0) # Tried fill_value=0.0 also
>>> print sparse_df.density
0.04

>>> print sparse_df['x5'].value_counts()
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "//anaconda/lib/python2.7/site-packages/pandas/core/series.py", line 1156, in value_counts
    normalize=normalize, bins=bins)
 File "//anaconda/lib/python2.7/site-packages/pandas/core/algorithms.py", line 231, in value_counts
    values = com._ensure_object(values)
  File "generated.pyx", line 112, in pandas.algos.ensure_object (pandas/algos.c:38788)
  File "generated.pyx", line 117, in pandas.algos.ensure_object (pandas/algos.c:38695)
  File "//anaconda/lib/python2.7/site-packages/pandas/sparse/array.py", line 377, in astype
    raise TypeError('Can only support floating point data for now')
TypeError: Can only support floating point data for now

1 Answer

2

This is not implemented for sparse data yet. Convert the column to dense first:

In [12]: sparse_df['x5'].to_dense().value_counts()
Out[12]: 
0.0    6
1.0    2
3.0    1
2.1    1
dtype: int64
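
If converting a large column to dense is too expensive, the same counts can in principle be built from the stored (non-fill) values plus the number of implicit fill entries. The following is a minimal plain-Python sketch of that idea, not pandas code: the function `sparse_value_counts`, its parameters, and the `{position: value}` layout are all assumptions made for illustration, and it is written for Python 3 rather than the Python 2.7 session in the question.

```python
from collections import Counter


def sparse_value_counts(length, stored, fill_value=0.0):
    """Count values in a column kept as {position: non-fill value}.

    `length`, `stored`, and `fill_value` are hypothetical names used only
    in this sketch; they are not part of the pandas API.
    """
    # Count the explicitly stored values.
    counts = Counter(stored.values())
    # Every position without a stored value implicitly holds fill_value.
    n_fill = length - len(stored)
    if n_fill:
        counts[fill_value] += n_fill
    return counts


# Column x5 from the question: only the non-zero entries are stored.
x5 = {0: 1.0, 3: 1.0, 4: 2.1, 5: 3.0}
counts = sparse_value_counts(10, x5)
for value, n in counts.most_common():
    print(value, n)
```

Only the four stored entries are visited here; the count for the fill value comes from simple arithmetic on the column length, so the cost does not grow with the number of zeros.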
