Python indexed-gzip package - PyPI

indexed_gzip

Fast random access of gzip files in Python
Overview
Installation
Usage
Using with nibabel
Index import/export
Write support
Performance
Acknowledgements
License

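Overview

The indexed_gzip project is a Python extension which aims to provide a drop-in replacement for the built-in Python gzip.GzipFile class: the IndexedGzipFile.

indexed_gzip was written to allow fast random access of compressed NIFTI image files (for which gzip is the de-facto compression standard), but it will work with any gzip file. indexed_gzip is easily integrated with nibabel (http://nipy.org/nibabel/).

The standard gzip.GzipFile class exposes a random access-like interface (via its seek and read methods), but every time you seek to a new point in the uncompressed data stream, the GzipFile instance has to start decompressing from the beginning of the file until it reaches the requested location.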
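To make that cost concrete, here is a minimal sketch that uses only the standard library. It builds a small gzip stream in memory and reads a few bytes from near its end; the offset passed to seek refers to the uncompressed stream. The payload and the helper name read_at are illustrative assumptions and are not part of indexed_gzip.

```python
import gzip
import io


def read_at(compressed, offset, nbytes):
    """Read nbytes starting at an uncompressed offset of a gzip byte string."""
    with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
        # A forward seek makes GzipFile decompress and discard
        # every byte that comes before `offset`.
        f.seek(offset)
        return f.read(nbytes)


# Illustrative payload: 100,000 numbered records of 10 bytes each.
payload = b"".join(b"%09d\n" % i for i in range(100_000))
compressed = gzip.compress(payload)

# Reading the last record still requires decompressing everything before it.
chunk = read_at(compressed, 999_990, 10)
print(chunk)
```

The call is short, but the work it does grows with the offset, which is the limitation an index is meant to remove.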

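An IndexedGzipFile instance works around this by building an index which contains seek points: mappings between corresponding locations in the compressed and uncompressed data streams. Each seek point is accompanied by a chunk (32KB) of uncompressed data which is used to initialise the decompression algorithm, allowing us to start reading from any seek point. If the index is built with a seek point spacing of 1MB, we only have to decompress (on average) 512KB of data to read from any location in the file.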
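The arithmetic behind the 512KB figure can be checked with a short sketch. It assumes seek points sit at exact multiples of the spacing, which is a simplification (real seek points can only be placed where the compressed stream allows). The function names are hypothetical, and the memory estimate counts only the 32KB chunk stored with each seek point.

```python
SPACING = 1024 * 1024   # 1 MiB between seek points, as in the text
WINDOW = 32 * 1024      # 32 KiB of uncompressed data kept per seek point


def bytes_to_skip(offset, spacing=SPACING):
    """Bytes to decompress and discard between the preceding seek point and offset."""
    return offset % spacing


def index_window_bytes(uncompressed_size, spacing=SPACING, window=WINDOW):
    """Rough memory used by the stored chunks, one seek point per `spacing` bytes."""
    n_points = -(-uncompressed_size // spacing)  # ceiling division
    return n_points * window


# Sample offsets uniformly across four seek-point intervals.
offsets = range(0, 4 * SPACING, 4096)
average = sum(bytes_to_skip(o) for o in offsets) / len(offsets)

print(average)                         # close to SPACING / 2
print(index_window_bytes(1024 ** 3))   # stored chunks for a 1 GiB stream
```

Larger spacing means a smaller index but more data to decompress per seek, so the spacing is a trade-off between memory and seek time.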

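Installation

indexed_gzip is available on PyPI. To install, simply type:

    pip install indexed_gzip

You can also install indexed_gzip from conda-forge:

    conda install -c conda-forge indexed_gzip

To compile indexed_gzip, make sure you have cython installed (and numpy if you want to compile the tests), and then run:

    python setup.py develop

To run the tests, type the following; you will need numpy and pytest installed:

    pytest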

Usage

You can use the indexed_gzip module directly:

    import indexed_gzip as igzip

    # You can create an IndexedGzipFile instance
    # by specifying a file name, or an open file
    # handle. For the latter use, the file handle
    # must be opened in read-only binary mode.
    # Write support is currently non-existent.
    myfile = igzip.IndexedGzipFile('big_file.gz')

    some_offset_into_uncompressed_data = 234195

    # The index will be automatically
    # built on-demand when seeking or
    # reading.
    myfile.seek(some_offset_into_uncompressed_data)
    data = myfile.read(1048576)
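Because IndexedGzipFile is described as a replacement for gzip.GzipFile, code can be written to use it when available and fall back to the standard class otherwise. The following is a sketch under that assumption; the names GzFile and read_chunk and the throwaway test file are illustrative and not part of either library.

```python
import gzip
import os
import tempfile

try:
    import indexed_gzip as igzip
    GzFile = igzip.IndexedGzipFile
except ImportError:
    # The standard library class offers the same seek/read calls,
    # only without the index.
    GzFile = gzip.GzipFile


def read_chunk(path, offset, nbytes):
    """Read nbytes from an uncompressed offset of the gzip file at path."""
    f = GzFile(path)
    try:
        f.seek(offset)
        return f.read(nbytes)
    finally:
        f.close()


# Write a small throwaway gzip file so the example runs anywhere.
path = os.path.join(tempfile.mkdtemp(), "example.gz")
payload = bytes(range(256)) * 4096   # 1 MiB of illustrative data
with gzip.open(path, "wb") as f:
    f.write(payload)

chunk = read_chunk(path, 234195, 16)
print(len(chunk))
```

Either class returns the same bytes; only the time taken to reach the offset differs.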

Using with nibabel

You can use indexed_gzip with nibabel. nibabel >= 2.3.0 will automatically use indexed_gzip if it is present:

    import nibabel as nib

    image = nib.load('big_image.nii.gz')

If you are using nibabel 2.2.x, you need to explicitly set the keep_file_open flag:

    import nibabel as nib

    image = nib.load('big_image.nii.gz', keep_file_open='auto')

To use indexed_gzip with nibabel 2.1.0 or older, you need to do a little more work:

    import nibabel as nib
    import indexed_gzip as igzip

    # Here we are using 4MB spacing between
    # seek points, and using a larger read
    # buffer (than the default size of 16KB).
    fobj = igzip.IndexedGzipFile(
        filename='big_image.nii.gz',
        spacing=4194304,
        readbuf_size=131072)

    # Create a nibabel image using
    # the existing file handle.
    fmap = nib.Nifti1Image.make_file_map()
    fmap['image'].fileobj = fobj
    image = nib.Nifti1Image.from_file_map(fmap)

    # Use the image ArrayProxy to access the
    # data - the index will automatically be
    # built as data is accessed.
    vol3 = image.dataobj[:, :, :, 3]

Index import/export

If you have a large file, you may wish to pre-generate the index once, and save it out to an index file:

    import indexed_gzip as igzip

    # Load the file, pre-generate the
    # index, and save it out to disk.
    fobj = igzip.IndexedGzipFile('big_file.gz')
    fobj.build_full_index()
    fobj.export_index('big_file.gzidx')

The next time you open the same file, you can load in the index:

    import indexed_gzip as igzip

    fobj = igzip.IndexedGzipFile('big_file.gz', index_file='big_file.gzidx')
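A saved index is only useful while it still matches the compressed file. One possible way to combine the two snippets above is sketched below: reuse the index file if it is at least as new as the data file, and rebuild it otherwise. The modification-time check and the helper names index_is_fresh and open_indexed are assumptions made for illustration; indexed_gzip is not stated to perform such a check itself.

```python
import os


def index_is_fresh(gz_path, index_path):
    """True if index_path exists and is at least as new as gz_path."""
    if not os.path.exists(index_path):
        return False
    return os.path.getmtime(index_path) >= os.path.getmtime(gz_path)


def open_indexed(gz_path, index_path):
    """Open gz_path, reusing a saved index when it looks current."""
    # Imported here so that index_is_fresh can be used without indexed_gzip.
    import indexed_gzip as igzip

    if index_is_fresh(gz_path, index_path):
        return igzip.IndexedGzipFile(gz_path, index_file=index_path)

    # No usable index: build one in full and save it for next time.
    fobj = igzip.IndexedGzipFile(gz_path)
    fobj.build_full_index()
    fobj.export_index(index_path)
    return fobj
```

A call such as open_indexed('big_file.gz', 'big_file.gzidx') then pays the indexing cost only on the first run, or after the data file has changed.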

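Write support

indexed_gzip does not currently have any support for writing. For the time being, if you wish to write to a file, you will need to save the file by alternate means (e.g. via gzip or nibabel), and then re-create a new IndexedGzipFile instance. For example:

    import nibabel as nib

    # Load the entire image into memory
    # (on recent nibabel versions, use
    # get_fdata() instead of get_data())
    image = nib.load('big_image.nii.gz')
    data = image.get_data()

    # Make changes to the data
    data[:, :, :, 5] *= 100

    # Save the image using nibabel. nib.save
    # expects an image object, so wrap the
    # modified array in a new Nifti1Image.
    new_image = nib.Nifti1Image(data, image.affine, image.header)
    nib.save(new_image, 'big_image.nii.gz')

    # Re-load the image
    image = nib.load('big_image.nii.gz')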

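Performance

A small test script is included with indexed_gzip; it compares the performance of the IndexedGzipFile class with that of the gzip.GzipFile class. The script does the following:

1. Generates a test file.
2. Generates a specified number of seek locations, uniformly spaced throughout the test file.
3. Randomly shuffles these locations.
4. Seeks to each location, and reads a chunk of data from the file.

This plot shows the results of this test for a few compressed files of varying sizes, with 500 seeks:

[Figure: Indexed gzip performance]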
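The four steps listed above can be sketched with the standard library alone. This is not the test script that ships with indexed_gzip; the file size, chunk size, number of seeks and function names are assumptions chosen to keep the example quick to run.

```python
import gzip
import os
import random
import tempfile
import time


def make_test_file(path, size):
    """Step 1: write `size` bytes of random data to a gzip file."""
    with gzip.open(path, "wb") as f:
        f.write(os.urandom(size))


def seek_locations(size, nseeks, seed=0):
    """Steps 2 and 3: evenly spaced offsets, then shuffled."""
    step = size // nseeks
    locations = [i * step for i in range(nseeks)]
    random.Random(seed).shuffle(locations)
    return locations


def time_seeks(opener, path, locations, chunk=1024):
    """Step 4: seek to each location and read a chunk.

    Returns (elapsed seconds, total bytes read).
    """
    total = 0
    start = time.perf_counter()
    f = opener(path)
    try:
        for loc in locations:
            f.seek(loc)
            total += len(f.read(chunk))
    finally:
        f.close()
    return time.perf_counter() - start, total


size = 1_000_000
path = os.path.join(tempfile.mkdtemp(), "bench.gz")
make_test_file(path, size)
locations = seek_locations(size, 50)

elapsed, nread = time_seeks(gzip.GzipFile, path, locations)
print(f"gzip.GzipFile: {len(locations)} seeks in {elapsed:.3f} s")
```

With indexed_gzip installed, passing igzip.IndexedGzipFile as the opener to time_seeks gives the second timing to compare against.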

Acknowledgements

The indexed_gzip project is based upon the zran.c example (written by Mark Adler) which ships with the zlib source code.

indexed_gzip was originally inspired by Zalan Rajna's (@zrajna) zindex project:

Z. Rajna, A. Keskinarkaus, V. Kiviniemi and T. Seppanen
"Speeding up the file access of large compressed NIfTI neuroimaging data"
Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual
International Conference of the IEEE, Milan, 2015, pp. 654-657.

https://sourceforge.net/projects/libznzwithzindex/

Initial work on indexed_gzip took place at Brainhack Paris, at the Institut Pasteur, 24th-26th February 2016, with the support of the FMRIB Centre, at the University of Oxford, UK.

Many thanks to the following contributors (listed chronologically):

Zalan Rajna (@zrajna): bug fixes (#2)
Martin Craig (@mcraig-ibme): porting indexed_gzip to Windows (#3)
Chris Markiewicz (@effigies): option to drop file handles (#6)
Omer Ozarslan (@ozars): index import/export (#8)

License

indexed_gzip inherits the zlib license, available for perusal in the LICENSE file.


indexed-gzip 0.8.10
