Provide additional features for pynamodb
Detailed description of the pynamodb-mate Python project
Welcome to the pynamodb_mate documentation
Feature 1. Store large binary objects in S3, store only the S3 URI in DynamoDB
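DynamoDB is a very good choice for a pay-as-you-go, high-concurrency key-value database. Sometimes you want to store large binary objects together with a DynamoDB item, especially in web crawler applications. But DynamoDB has a limit: a single item cannot be larger than 400 KB. How can you work around this?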
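A simple solution is to store the large binary object in S3 and store only its S3 URI in DynamoDB. The pynamodb_mate library provides this feature on top of the pynamodb project (a DynamoDB ORM layer for Python).
Here is how to define the ORM layer: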
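The pattern itself can be sketched without any AWS calls. The example below is a minimal illustration, not pynamodb_mate's implementation: two plain dictionaries stand in for an S3 bucket and a DynamoDB table, and the names `fake_s3`, `fake_table`, `save_page`, and `load_html` are hypothetical.

```python
# Illustration only: dictionaries stand in for S3 and DynamoDB.
fake_s3 = {}     # maps S3 URI -> bytes (the large payload)
fake_table = {}  # maps hash key -> item dict (small metadata only)


def save_page(url: str, html: str) -> dict:
    """Store the payload in "S3" and only its URI in the "table"."""
    s3_uri = "s3://my-bucket/{}.html".format(len(fake_s3))
    fake_s3[s3_uri] = html.encode("utf-8")
    item = {"url": url, "html_content": s3_uri}
    fake_table[url] = item
    return item


def load_html(url: str) -> str:
    """Follow the stored URI back to the payload."""
    s3_uri = fake_table[url]["html_content"]
    return fake_s3[s3_uri].decode("utf-8")


# About 1.3 MB of text, far more than a single DynamoDB item allows.
big_html = "Hello World!\n" * 100000
item = save_page("http://www.python.org", big_html)
```

The item kept in the table stays tiny no matter how large the payload grows, because it only carries the URI string.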
from pynamodb.models import Model
from pynamodb.attributes import UnicodeAttribute
from pynamodb_mate.s3_backed_attribute import (
    S3BackedBinaryAttribute,
    S3BackedUnicodeAttribute,
    S3BackedMixin,
    s3_key_safe_b64encode,
)

BUCKET_NAME = "my-bucket"
URI_PREFIX = "s3://{BUCKET_NAME}/".format(BUCKET_NAME=BUCKET_NAME)


class PageModel(Model, S3BackedMixin):
    class Meta:
        table_name = "pynamodb_mate-pages"
        region = "us-east-1"

    url = UnicodeAttribute(hash_key=True)
    cover_image_url = UnicodeAttribute(null=True)

    # this field is for html content string
    html_content = S3BackedUnicodeAttribute(
        s3_uri_getter=lambda obj: URI_PREFIX + s3_key_safe_b64encode(obj.url) + ".html",
        compress=True,
    )
    # this field is for image binary content
    cover_image_content = S3BackedBinaryAttribute(
        s3_uri_getter=lambda obj: URI_PREFIX + s3_key_safe_b64encode(obj.cover_image_url) + ".jpg",
        compress=True,
    )
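The `s3_uri_getter` callables in the model build an S3 key from a URL, which can contain characters such as `/`, `:`, and `?`. The sketch below shows one way a key-safe encoding could work, using the standard library's URL-safe base64. `key_safe_encode` and `key_safe_decode` are hypothetical names for illustration; the actual behavior of `s3_key_safe_b64encode` is not shown in this document and may differ.

```python
import base64


def key_safe_encode(text: str) -> str:
    """Encode text with the URL-safe base64 alphabet.

    The output uses only A-Z, a-z, 0-9, '-', '_', and '=' padding,
    so it never contains '/' or ':'.
    """
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii")


def key_safe_decode(key: str) -> str:
    """Reverse key_safe_encode to recover the original text."""
    return base64.urlsafe_b64decode(key.encode("ascii")).decode("utf-8")


url = "http://www.python.org"
key = key_safe_encode(url)
s3_uri = "s3://my-bucket/" + key + ".html"
```

Because the encoding is reversible, the original URL can be recovered from the key alone, which can help when inspecting a bucket by hand.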
Here is how to store large binary objects in S3:
url = "http://www.python.org"
url_cover_image = "http://www.python.org/logo.jpg"

html_content = "Hello World!\n" * 1000
cover_image_content = ("this is a dummy image!\n" * 1000).encode("utf-8")

page = PageModel(url=url, cover_image_url=url_cover_image)

# create: if something goes wrong with s3.put_object in the middle,
# the dirty s3 objects will be cleaned up
page.atomic_save(
    s3_backed_data=[
        page.html_content.set_to(html_content),
        page.cover_image_content.set_to(cover_image_content),
    ]
)

# update: if something goes wrong with s3.put_object in the middle,
# the partially written new s3 objects will be rolled back
html_content_new = "Good Bye!\n" * 1000
cover_image_content_new = ("this is another dummy image!\n" * 1000).encode("utf-8")

page.atomic_update(
    s3_backed_data=[
        page.html_content.set_to(html_content_new),
        page.cover_image_content.set_to(cover_image_content_new),
    ]
)

# delete: make sure the s3 objects are all gone
page.atomic_delete()
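The clean-up behavior described in the comments can be illustrated with a small all-or-nothing write. This is a sketch under stated assumptions, not the library's code: `FlakyStore` is a hypothetical in-memory stand-in for S3 that can simulate a failed put, `save_all_or_nothing` is a hypothetical helper, and only the create case (no pre-existing objects) is covered.

```python
class FlakyStore:
    """In-memory stand-in for S3 that can be told to fail on a given key."""

    def __init__(self, fail_on=None):
        self.objects = {}
        self.fail_on = fail_on

    def put(self, key, body):
        if key == self.fail_on:
            raise IOError("simulated put failure for " + key)
        self.objects[key] = body

    def delete(self, key):
        self.objects.pop(key, None)


def save_all_or_nothing(store, table, item_key, payloads):
    """Write every payload, then the item; undo the writes if any step fails."""
    written = []
    try:
        for key, body in payloads.items():
            store.put(key, body)
            written.append(key)
        # Only after every object exists do we record the item itself.
        table[item_key] = sorted(payloads)
    except Exception:
        for key in written:
            store.delete(key)  # clean up objects written before the failure
        raise


table = {}

good = FlakyStore()
save_all_or_nothing(good, table, "page-1", {"a.html": b"<html>", "a.jpg": b"\xff"})

bad = FlakyStore(fail_on="b.jpg")
try:
    save_all_or_nothing(bad, table, "page-2", {"b.html": b"<html>", "b.jpg": b"\xff"})
except IOError:
    pass  # "b.html" was written first and has been removed again
```

Writing the table item last means a reader never sees an item whose objects are missing. An update would additionally need to remember the previous objects so they can be restored, which this sketch leaves out.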
Installation
pynamodb_mate is released on PyPI, so all you need is:
$ pip install pynamodb_mate
To upgrade to the latest version:
$ pip install --upgrade pynamodb_mate