Synchronous buffered streaming reads and writes to Google Storage blobs.
gs-chunked-io: Google Storage streams
gs-chunked-io provides transparently chunked IO streams for Google Storage objects. Writable streams are managed as multipart objects and composed when the stream is closed. IO operations are concurrent by default. The number of concurrent threads can be adjusted with the threads parameter, or concurrency can be disabled entirely with threads=None.
import gs_chunked_io as gscio
from google.cloud.storage import Client
client = Client()
bucket = client.bucket("my-bucket")
blob = bucket.get_blob("my-key")
# Readable stream:
with gscio.Reader(blob) as fh:
    fh.read(size)
# Writable stream:
with gscio.Writer("my_new_key", bucket) as fh:
    fh.write(data)
# Process blob in chunks:
for chunk in gscio.for_each_chunk(blob):
    my_chunk_processor(chunk)
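Conceptually, chunked iteration yields successive fixed-size byte ranges of the blob until it is exhausted. A plain-Python sketch of the same pattern over an in-memory buffer (the function name and chunk size here are illustrative, not part of the library's API):

```python
import io

def for_each_chunk_like(fileobj, chunk_size):
    # Yield successive fixed-size chunks until the stream is exhausted.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk

data = io.BytesIO(b"abcdefghij")
chunks = list(for_each_chunk_like(data, chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

The final chunk may be shorter than chunk_size; downstream processors should not assume uniform chunk length.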
# Multipart copy with processing:
dst_bucket = client.bucket("my_dest_bucket")
with gscio.Writer("my_dest_key", dst_bucket) as writer:
with gscio.Writer("my_dest_key", dst_bucket) as writer:
    for chunk in gscio.for_each_chunk(blob):
        process_my_chunk(chunk)
        writer.write(chunk)
# Extract .tar.gz on the fly:
import gzip
import tarfile
with gscio.Reader(blob) as fh:
    gzip_reader = gzip.GzipFile(fileobj=fh)
    tf = tarfile.TarFile(fileobj=gzip_reader)
    for tarinfo in tf:
        process_my_tarinfo(tarinfo)
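The layered-stream pattern above works with any readable file object, not just a GCS reader. A self-contained local demonstration (no GCS involved; the archive contents are made up for illustration):

```python
import gzip
import io
import tarfile

# Build a small .tar.gz entirely in memory.
payload = b"hello"
tar_buf = io.BytesIO()
with tarfile.open(fileobj=tar_buf, mode="w:gz") as tf:
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)
    tf.addfile(info, io.BytesIO(payload))
tar_buf.seek(0)

# Extract on the fly by layering GzipFile over the readable stream,
# exactly as done with the GCS reader above.
gzip_reader = gzip.GzipFile(fileobj=tar_buf)
with tarfile.TarFile(fileobj=gzip_reader) as tf:
    for tarinfo in tf:
        content = tf.extractfile(tarinfo).read()
print(content)  # b'hello'
```

Because decompression and tar parsing both operate on streams, the whole archive never needs to be materialized on disk.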
Installation
pip install gs-chunked-io
Links
Bugs
Please report bugs, issues, and feature requests on GitHub.
- Project