Data format for saving pandas data with a DatetimeIndex

Posted 2024-05-23 18:27:48


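I do a lot of work with data that has a datetime index or a MultiIndex. Saving to and reading from .csv is tedious: every time I save, I have to call reset_index and name the column "date", and when I read the file back I have to convert the dates back to datetimes and set the index again. What format would let me avoid this? I would prefer something open source; I believe SAS and Stata can do this, for example, but they are proprietary.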
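For reference, the CSV round trip described above might look roughly like the sketch below. The frame, the ("date", "ticker") index levels, and the "value" column are hypothetical and chosen only for illustration; an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical example data: a (date, ticker) MultiIndex with one value column.
idx = pd.MultiIndex.from_product(
    [pd.to_datetime(["2024-01-01", "2024-01-02"]), ["A", "B"]],
    names=["date", "ticker"],
)
df = pd.DataFrame({"value": [1.5, 2.5, 3.5, 4.5]}, index=idx)

# Saving: flatten the index into ordinary columns before writing.
buf = io.StringIO()
df.reset_index().to_csv(buf, index=False)
buf.seek(0)

# Loading: dates come back as plain strings, so they are converted
# to datetimes by hand and the index is rebuilt.
restored = pd.read_csv(buf)
restored["date"] = pd.to_datetime(restored["date"])
restored = restored.set_index(["date", "ticker"])
```

As a side note, pandas.read_csv also accepts index_col and parse_dates arguments, which can fold the two loading steps into the read call, although the dates are still stored as text in the file.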


1 answer
User
#1 · Posted 2024-05-23 18:27:48

Feather was made for this: https://github.com/wesm/feather

Feather provides binary columnar serialization for data frames. It is designed to make reading and writing data frames efficient, and to make sharing data across data analysis languages easy. This initial version comes with bindings for python (written by Wes McKinney) and R (written by Hadley Wickham).

Feather uses the Apache Arrow columnar memory specification to represent binary data on disk. This makes read and write operations very fast. This is particularly important for encoding null/NA values and variable-length types like UTF8 strings.

Feather is a part of the broader Apache Arrow project. Feather defines its own simplified schemas and metadata for on-disk representation.

Feather currently supports the following column types:

- A wide range of numeric types (int8, int16, int32, int64, uint8, uint16, uint32, uint64, float, double).
- Logical/boolean values.
- Dates, times, and timestamps.
- Factors/categorical variables that have fixed set of possible values.
- UTF-8 encoded strings.
- Arbitrary binary data.
