How to read a fixed-width-format text file in pandas

Published 2024-05-15 04:43:29


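I'm new to pandas and am working out how to read a file. The file comes from the WRDS database and is the S&P 500 constituent list going back to the 1960s. I have inspected the file, but however I import it with "read_csv", I still can't get the data to display correctly.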

df = read_csv('sp500-sb.txt')

df

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1231 entries, 0 to 1230
Data columns: gvkeyx      from      thru     conm
                                        gvkey      co_conm
...(the column names)
dtypes: object(1)

What does the output above mean? Any help is appreciated.


3 answers

What do you mean by "display"? Doesn't df['gvkey'] give you the data in the gvkey column?

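If what you want is to print the whole DataFrame to the console, take a look at df.to_string(), although the result is hard to read when there are many columns. When there are too many columns, pandas does not print the full contents by default: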

import numpy
import pandas

# Two frames of random data: a narrow one (3 columns) and a wide one (30 columns)
df1 = pandas.DataFrame(numpy.random.randn(10, 3), columns=['col%d' % d for d in range(3)])
df2 = pandas.DataFrame(numpy.random.randn(10, 30), columns=['col%d' % d for d in range(30)])

print(df1)   # <--- substitute df2 to see the difference
print()
print(df1['col1'])   # a single column
print()
print(df1.to_string())   # the full frame as one string
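
As a complement to to_string(), here is a minimal sketch of lifting the default column limit only for one print, using pandas display options. The frame name df_wide and the width of 1000 are arbitrary choices for illustration, not something from the question.

```python
import numpy as np
import pandas as pd

# A wide frame of random data, similar to df2 in the answer above.
df_wide = pd.DataFrame(np.random.randn(10, 30),
                       columns=['col%d' % d for d in range(30)])

# Temporarily remove the column limit and allow long lines, so that
# no columns are elided; the settings revert when the block exits.
with pd.option_context('display.max_columns', None, 'display.width', 1000):
    wide_text = repr(df_wide)

print(wide_text)
```

Outside the with block the usual defaults apply again, so other output in the session is not affected.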

Wes replied to me in an e-mail. Cheers.

This is a fixed-width-format file (not delimited by commas or tabs as usual). I realize that pandas does not have a fixed-width reader like R does, though one can be fashioned very easily. I'll see what I can do. In the meantime if you can export the data in another format (like csv--truly comma separated) you'll be able to read it with read_csv. I suspect with some unix magic you can transform a FWF file into a CSV file.

I recommend following the issue on github as your e-mail is about to disappear from my inbox :)

https://github.com/pydata/pandas/issues/920

best, Wes
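
The FWF-to-CSV conversion mentioned in the e-mail can also be done in plain Python instead of Unix tools. The following is only a sketch: the helper name fwf_to_csv, the column offsets, and the sample lines are all made up for illustration and do not reflect the real WRDS layout.

```python
import csv
import io

# Hypothetical (start, end) character offsets for each column.
colspecs = [(0, 8), (8, 18), (18, 40)]

def fwf_to_csv(lines, colspecs):
    """Slice each line at fixed offsets and return the result as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Cut out each field and trim the padding spaces around it.
        writer.writerow([line[start:end].strip() for start, end in colspecs])
    return out.getvalue()

# Made-up sample in fixed-width layout (not real WRDS data).
sample = [
    "gvkeyx  from      conm\n",
    "000003  19640331  Example Corp, Inc.\n",
]

csv_text = fwf_to_csv(sample, colspecs)
print(csv_text)
```

Using the csv module rather than a plain join means that fields containing commas, such as company names, get quoted, so read_csv can parse the result.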

pandas.read_fwf() was added in pandas 0.7.3 (April 2012) to handle fixed-width files.

  1. API reference

  2. An example from other question
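
For a quick illustration of read_fwf, here is a minimal sketch. The column names are the ones shown in the question's output, but the widths and the rows are invented for the example, so check the actual layout of sp500-sb.txt before reusing them.

```python
import io
import pandas as pd

# Hypothetical layout: (column name, width in characters).
layout = [("gvkeyx", 8), ("from", 10), ("thru", 10),
          ("conm", 16), ("gvkey", 8), ("co_conm", 20)]
names = [name for name, _ in layout]
widths = [width for _, width in layout]

# Made-up rows; the second one has an empty "thru" field.
rows = [
    ("000003", "19640331", "19980102", "S&P 500 Index", "001234", "Example Corp A"),
    ("000003", "19700101", "", "S&P 500 Index", "005678", "Example Corp B"),
]

def pad(values):
    """Left-justify each value to its column width, as a fixed-width file would."""
    return "".join(str(v).ljust(w) for v, w in zip(values, widths))

text = "\n".join([pad(names)] + [pad(r) for r in rows]) + "\n"

# widths gives the size of each field; dtype=str keeps leading zeros in the keys.
df = pd.read_fwf(io.StringIO(text), widths=widths, dtype=str)
print(df)
```

With a real file, pass the path instead of the StringIO object, for example pd.read_fwf('sp500-sb.txt', widths=widths, dtype=str). If the widths are unknown, read_fwf can also try to infer the column boundaries when neither widths nor colspecs is given.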
