A package for representing an IBM Cloud Object Storage (COS) bucket like a file system.
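IBM Cloud Object Storage Simple File System Library
The problem with the IBM boto3 library
IBM Cloud Object Storage gives a very poor representation of the objects under a bucket.
For example, if you use ibm_boto3, you can only list all objects under a bucket, so you may end up with a flat Python list of the object keys in that bucket, such as: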
["source/","source/year=2018/","source/year=2018/month=08/","source/year=2018/month=08/day=28/","source/year=2018/month=08/day=28/test1.txt","source/year=2018/month=08/day=28/test.txt","source/year=2018/month=08/day=29/","source/year=2018/month=08/day=29/test.txt","source/year=2018/month=08/day=30/","source/year=2018/month=08/day=30/test.txt","source/year=2018/month=08/day=31/","source/year=2018/month=08/day=31/test.txt","source/year=2019/month=01/day=01/","source/year=2019/month=01/day=01/test.txt"]
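This is awkward when you are trying to get the "end files" (i.e. the test.txt files in the example above), and it also makes the structure of the bucket hard to understand.
What does this library do?
This library adds a presentation layer on top of boto3 that turns the flat representation of the objects in a bucket (i.e. a Python list) into a tree data structure. A bucket can therefore be represented as a file system, with the notions of folders/directories and files. Currently, the library can simulate file system commands such as "cd" and "ls". It also provides APIs for getting the files in a bucket; see the Usage section.
For the bucket objects above, this library represents them as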
test-bucket/
└─ source/
   └─ year=2018/
      └─ month=08/
         └─ day=28/
            └─ test1.txt
            └─ test.txt
         └─ day=29/
            └─ test.txt
         └─ day=30/
            └─ test.txt
         └─ day=31/
            └─ test.txt
   └─ year=2019/
      └─ month=01/
         └─ day=01/
            └─ test.txt
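The idea of turning flat keys into a tree can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's actual implementation: `build_tree`, `render`, and `ls` are hypothetical helpers, directories are modelled as nested dicts, and files as `None`.

```python
def build_tree(bucket_name, keys):
    """Build a nested dict from flat object keys (hypothetical helper).

    Directory names keep their trailing "/" and map to dicts; files map to None.
    """
    root = {}
    for key in keys:
        is_dir = key.endswith("/")
        parts = [p for p in key.split("/") if p]
        node = root
        for i, part in enumerate(parts):
            is_last = i == len(parts) - 1
            if is_last and not is_dir:
                node.setdefault(part, None)  # a file
            else:
                # a directory; create it if an explicit key for it is missing
                node = node.setdefault(part + "/", {})
    return {bucket_name + "/": root}


def render(tree, depth=0):
    """Return the tree as a list of indented text lines."""
    lines = []
    for name in sorted(tree):
        prefix = ("   " * (depth - 1) + "└─ ") if depth else ""
        lines.append(prefix + name)
        child = tree[name]
        if child:  # non-empty directory
            lines.extend(render(child, depth + 1))
    return lines


def ls(tree, path):
    """List the children of a directory path such as 'test-bucket/source/'."""
    node = tree
    for part in [p + "/" for p in path.split("/") if p]:
        node = node[part]
    return sorted(node)


keys = [
    "source/",
    "source/year=2018/month=08/day=28/test1.txt",
    "source/year=2018/month=08/day=28/test.txt",
    "source/year=2019/month=01/day=01/test.txt",
]
tree = build_tree("test-bucket", keys)
print("\n".join(render(tree)))
print(ls(tree, "test-bucket/source/"))
```

Intermediate directories such as "source/year=2019/month=01/" are created even when no explicit key exists for them, which matches the tree shown above.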
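Concepts
- boto3 object and key: in boto3, an object is identified by its key, a string such as "source/" (the key does not include the bucket name).
- Path: a path in this simple fs starts with the bucket name, e.g. the path "test-bucket/source/" represents the boto3 object "source/".
- Leaf: a leaf is a COSBucketTreeNode object whose boto3 object key is not a common prefix of any other boto3 object. For example, "source/year=2018/month=08/day=28/test1.txt" in the example above is the boto3 object key of a leaf.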
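The leaf definition above can be illustrated on a flat key list without the library. The sketch below is a hypothetical helper (`find_leaf_keys` is not part of ibm-cos-simple-fs) and assumes that "prefix" means a directory prefix, so a key counts as a leaf when no other key lies underneath it.

```python
def find_leaf_keys(keys):
    """Return the keys that are not a directory prefix of any other key."""
    unique = set(keys)
    leaves = []
    for key in sorted(unique):
        # Compare on a directory boundary so "a/test" is not treated as
        # a prefix of "a/test1.txt".
        prefix = key if key.endswith("/") else key + "/"
        if not any(other != key and other.startswith(prefix) for other in unique):
            leaves.append(key)
    return leaves


keys = [
    "source/",
    "source/year=2018/",
    "source/year=2018/month=08/day=28/",
    "source/year=2018/month=08/day=28/test1.txt",
    "source/year=2018/month=08/day=28/test.txt",
]
print(find_leaf_keys(keys))
```

Under this assumption an empty directory key (one with nothing beneath it) would also count as a leaf; whether the library treats it the same way is not stated in this document.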
Installation
The project is available on PyPI: https://pypi.org/project/ibm-cos-simple-fs/
pip install ibm-cos-simple-fs
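Usage
Note that paths output by this library always have the bucket name prepended, so they take the form "bucket_name/path/to/your/stuff.txt". Before using such a path as a key name with the boto3 library, post-process it to strip the "bucket_name/" part.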
> from ibm_cos_fs.bucket_tree import COSBucketTree
# Given flat_object_list (the list of keys from the problem statement above), build the tree structure:
> tree = COSBucketTree(bucket_name='test-bucket', object_list=flat_object_list) # flat_object_list should be a list of strings
# Get all leaves as a list of path strings
> leaf_paths = tree.get_leaf_paths()
# Print the tree representation of the file system structure
> tree.print()
# To get all the child nodes of a given boto3 object key, say 'source/year=2018/month=08/'
> node = tree.get_node_from_key('source/year=2018/month=08/') # This simulates 'cd source/year=2018/month=08/' in a file system.
> node.children # To get the child nodes as a {name: TreeNode} map
# Or
> node.list_children() # To get a list of children as strings. This simulates 'ls source/year=2018/month=08/' in a file system.
# This library also provides APIs to get leaves under a node
# To get all the leaves under a given object key, say 'source/year=2018/month=08/day=29/'
# First, find the node for this key
> node = tree.get_node_from_key('source/year=2018/month=08/day=29/')
> node.is_dir # shows whether or not the current node is a directory
# Then, get the leaf nodes under that node
> leaf_nodes = tree.get_leaves(node) # Note: all_leaves = tree.get_leaves() returns all leaves from the root
# You can then convert them to string representation (i.e. paths) by
> [str(l) for l in leaf_nodes] # or [l.path for l in leaf_nodes]
# In addition, you can get leaves as boto3 object keys
> [l.key for l in leaf_nodes]
# To get the directory that contains all the given leaves; this is the reverse operation of get_leaves(common_parent_node)
> common_parent_node = tree.get_common_parent_for_leaves(leaf_nodes)
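The post-processing mentioned in the note at the top of this section (stripping "bucket_name/" from a path before passing it to boto3) could look like the sketch below. `path_to_key` is a hypothetical helper written for illustration, not an API of this library; when working with node objects, the `key` attribute shown above already gives the boto3 key.

```python
def path_to_key(path, bucket_name):
    """Convert a 'bucket_name/...' path string into a boto3 object key."""
    prefix = bucket_name + "/"
    if not path.startswith(prefix):
        raise ValueError(f"path {path!r} does not start with {prefix!r}")
    return path[len(prefix):]


path = "test-bucket/source/year=2018/month=08/day=28/test.txt"
print(path_to_key(path, "test-bucket"))
```

Raising on a mismatched bucket name, rather than returning the path unchanged, is a design choice made here so that a path from a different bucket is not silently used as a key.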
Author
Copyright © 2019 Shengyi Pan