How do I access S3 files by URL in Python?

36 votes

6 answers

85592 views

Asked 2025-04-16 11:48

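I want to write a Python script that reads and writes files on S3 by their URL, for example 's3:/mybucket/file'. The script needs to run both locally and in the cloud without any code changes. Is there a way to do this?

Edit: There are some good suggestions here, but what I really want is something that lets me do this: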

 myfile = open("s3://mybucket/file", "r")

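and then use that file object like any other file object. That would be really cool. If nothing like this exists, I may write it myself; I could build this abstraction layer on top of simples3 or boto.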

Tags: scripting, automation, data reading, file objects, s3, boto, cloud storage, URL access

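6 Answers

I haven't seen anything that handles S3 URLs directly, but you could use an S3 access library (simples3 looks decent) together with some simple string manipulation: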

>>> url = "s3:/bucket/path/"
>>> _, path = url.split(":", 1)
>>> path = path.lstrip("/")
>>> bucket, path = path.split("/", 1)
>>> print(bucket)
bucket
>>> print(path)
path/
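The same split can also be sketched with the standard library's `urllib.parse` instead of manual string handling. The helper name `parse_s3_url` is hypothetical, and the sketch assumes keys contain no `?` or `#`, since `urlparse` would treat those as query and fragment separators:

```python
from urllib.parse import urlparse


def parse_s3_url(url):
    """Split an S3 URL into (bucket, key). Hypothetical helper, not a library API."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URL: {url!r}")
    if parsed.netloc:
        # Standard form: s3://bucket/key
        return parsed.netloc, parsed.path.lstrip("/")
    # Single-slash form used in the question: s3:/bucket/key
    bucket, _, key = parsed.path.lstrip("/").partition("/")
    return bucket, key
```

Both `parse_s3_url("s3://mybucket/file")` and `parse_s3_url("s3:/mybucket/file")` give the bucket `mybucket` and the key `file`.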

Answered 2025-04-16 by Python大师

This is how it is implemented in awscli:

def find_bucket_key(s3_path):
    """
    This is a helper function that given an s3 path such that the path is of
    the form: bucket/key
    It will return the bucket and the key represented by the s3 path
    """
    s3_components = s3_path.split('/')
    bucket = s3_components[0]
    s3_key = ""
    if len(s3_components) > 1:
        s3_key = '/'.join(s3_components[1:])
    return bucket, s3_key


def split_s3_bucket_key(s3_path):
    """Split s3 path into bucket and key prefix.
    This will also handle the s3:// prefix.
    :return: Tuple of ('bucketname', 'keyname')
    """
    if s3_path.startswith('s3://'):
        s3_path = s3_path[5:]
    return find_bucket_key(s3_path)

You can use it directly with code like this:

from awscli.customizations.s3.utils import split_s3_bucket_key
import boto3
client = boto3.client('s3')
bucket_name, key_name = split_s3_bucket_key(
    's3://example-bucket-name/path/to/example.txt')
response = client.get_object(Bucket=bucket_name, Key=key_name)

This doesn't fully solve the problem of treating an S3 key as a file object, but it is a step in that direction.
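One possible next step, as a rough sketch only: read the object body returned by `get_object` into memory and wrap it in a standard text file object. The function name `open_s3_for_reading` and the `client` and `encoding` parameters are assumptions made for illustration. The whole object is loaded into memory, so this suits small files only.

```python
import io


def open_s3_for_reading(bucket, key, client=None, encoding="utf-8"):
    """Return a read-only text file object for an S3 key (hypothetical helper)."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stand-in client.
        import boto3
        client = boto3.client("s3")
    # get_object returns a dict whose "Body" entry is a stream with .read().
    data = client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return io.TextIOWrapper(io.BytesIO(data), encoding=encoding)
```

The result supports the usual `read()`, `readline()`, and line iteration, so code written against local files opened in `"r"` mode should mostly work unchanged.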

Answered 2025-04-16 by Python大师

Opening a file is straightforward:

from urllib.request import urlopen
myurl = "https://s3.amazonaws.com/skyl/fake.xyz"
myfile = urlopen(myurl)

This works for S3 if the file is public.

Writing a file with boto looks roughly like this:

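from boto.s3.connection import S3Connection
# AWS_KEY, AWS_SECRET, BUCKET, and filename are placeholders you must define.
conn = S3Connection(AWS_KEY, AWS_SECRET)
bucket = conn.get_bucket(BUCKET)
destination = bucket.new_key()
destination.name = filename
# Upload the contents of the open file object, then make the key public.
destination.set_contents_from_file(myfile)
destination.make_public()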

If this works for you, let me know :)
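For the writing side, here is a minimal sketch of a file object that works the same way for local paths and S3 URLs. It assumes boto3 (the successor to the boto library) rather than boto itself, and the names `S3WriteBuffer` and `open_for_writing` are hypothetical. Bytes are buffered in memory and uploaded with `put_object` when the file is closed, so it is not suitable for very large files.

```python
import io


class S3WriteBuffer(io.BytesIO):
    """In-memory buffer that uploads its contents to S3 on close (hypothetical)."""

    def __init__(self, client, bucket, key):
        super().__init__()
        self._client, self._bucket, self._key = client, bucket, key

    def close(self):
        if not self.closed:
            # Upload before closing; getvalue() is unavailable afterwards.
            self._client.put_object(
                Bucket=self._bucket, Key=self._key, Body=self.getvalue()
            )
        super().close()


def open_for_writing(url, client=None):
    """Open a local path or an s3://bucket/key URL for binary writing."""
    if not url.startswith("s3://"):
        return open(url, "wb")
    bucket, _, key = url[len("s3://"):].partition("/")
    if client is None:
        # Imported lazily so local-only use does not require boto3.
        import boto3
        client = boto3.client("s3")
    return S3WriteBuffer(client, bucket, key)
```

Used as `with open_for_writing("s3://mybucket/file") as f: f.write(b"data")`, the same calling code runs locally by passing a plain path instead. Note that the upload also happens if the `with` block exits with an exception, which a more careful version might want to prevent.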

Answered 2025-04-16 by Python大师
