pandas_streaming: Python package - PyPI

Streaming operations with pandas.

Detailed description of the pandas_streaming Python project

README

https://circleci.com/gh/sdpython/pandas_streaming/tree/master.svg?style=svg

https://badge.fury.io/py/pandas_streaming.svg

https://codecov.io/github/sdpython/pandas_streaming/coverage.svg?branch=master

https://api.codacy.com/project/badge/Grade/f53b7f4d6a0447aa9ce0c4ad5df659ef

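pandas_streaming aims at processing big files with pandas: files too big to hold in memory, yet too small for parallelization to bring a significant gain. The module replicates a subset of the pandas API and implements additional functionality for machine learning.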

from pandas_streaming.df import StreamingDataFrame
sdf = StreamingDataFrame.read_csv("filename", sep="\t", encoding="utf-8")

for df in sdf:
    # process this chunk of data
    # df is a dataframe
    print(df)
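
The chunk-by-chunk idea can also be illustrated with plain pandas. The sketch below is not how pandas_streaming is implemented. It only shows the general pattern of aggregating over chunks so that the whole file is never in memory at once. The sample data and the helper name `chunked_sum` are assumptions made for illustration. `pandas.read_csv` with `chunksize` is standard pandas.

```python
import io
import pandas

# Small in-memory stand-in for a large tab-separated file (hypothetical data).
data = io.StringIO("key\tvalue\na\t1\nb\t2\na\t3\nb\t4\na\t5\n")


def chunked_sum(buffer, chunksize=2):
    """Sum the 'value' column per 'key', reading `chunksize` rows at a time."""
    totals = {}
    # With chunksize set, read_csv returns an iterator of dataframes.
    for chunk in pandas.read_csv(buffer, sep="\t", chunksize=chunksize):
        partial = chunk.groupby("key")["value"].sum()
        # Merge the partial result of this chunk into the running totals.
        for key, value in partial.items():
            totals[key] = totals.get(key, 0) + int(value)
    return totals


totals = chunked_sum(data)
print(totals)
```

Only one chunk and the running totals are held in memory at any time, which is what makes this pattern suitable for files that do not fit in memory.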

The module can also stream an existing dataframe.

import pandas
df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                       dict(cf=1, cint=1, cstr="1"),
                       dict(cf=3, cint=3, cstr="3")])

from pandas_streaming.df import StreamingDataFrame
sdf = StreamingDataFrame.read_df(df)

for df in sdf:
    # process this chunk of data
    # df is a dataframe
    print(df)
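
To make the idea of streaming an in-memory dataframe concrete, here is a minimal sketch in plain pandas. It assumes nothing about the internals of `StreamingDataFrame.read_df`. The generator `iter_chunks` and its `chunksize` parameter are hypothetical names introduced for this example.

```python
import pandas


def iter_chunks(df, chunksize):
    """Yield consecutive row slices of `df`, each with at most `chunksize` rows."""
    for start in range(0, len(df), chunksize):
        yield df.iloc[start:start + chunksize]


df = pandas.DataFrame([dict(cf=0, cint=0, cstr="0"),
                       dict(cf=1, cint=1, cstr="1"),
                       dict(cf=3, cint=3, cstr="3")])

# Three rows split into chunks of two rows give one chunk of 2 and one of 1.
sizes = [len(chunk) for chunk in iter_chunks(df, 2)]
print(sizes)
```

Each yielded chunk is itself a dataframe, so code written for the chunked loop above works unchanged on it.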

Links:

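History

current - 2018-05-17 - 0.00MB

6: add pandas_groupby_nan from pyensae (2018-05-17)

0.1.66 - 2018-02-05 - 0.02MB

5: add random_state parameter to splitting functions (2018-02-04)
2: add method sample, reservoir sampling (2017-11-05)
3: method train_test_split for out-of-memory datasets (2017-10-21)
1: Excited for your project (2017-10-10)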

pandas_streaming 0.1.87
