Mounting an Azure storage container to a Databricks workspace/notebook raises AttributeError

Posted 2024-06-12 21:45:41


I'm trying to mount an Azure Blob Storage container to a Databricks notebook using a Key Vault-backed secret scope.

Setup:

  1. Created a key vault
  2. Created a secret in the key vault
  3. Created a Databricks secret scope
  • This part is confirmed working:
    • Running dbutils.secrets.get(scope = dbrick_secret_scope, key = dbrick_secret_name) raises no errors
    • Viewing the secret in Databricks shows [REDACTED]

Cell in the Databricks notebook:

%python

dbrick_secret_scope = "dbricks_kv_dev"
dbrick_secret_name = "scrt-account-key"

storage_account_key = dbutils.secrets.get(scope = dbrick_secret_scope, key = dbrick_secret_name)
storage_container = 'abc-test'
storage_account = 'stgdev'

dbutils.fs.mount(
    source = f'abfss://{storage_container}@{storage_account}.dfs.core.windows.net/',
    mount_point = f'/mnt/{storage_account}',
    extra_configs = {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net:{storage_account_key}'}
)

Results:

  • Error: AttributeError: 'set' object has no attribute 'keys', with the mount_point line of the dbutils.fs.mount() call highlighted in red
  • Full error:
AttributeError                            Traceback (most recent call last)
<command-3166320686381550> in <module>
      9     source = f'abfss://{storage_container}@{storage_account}.dfs.core.windows.net/',
     10     mount_point = f'/mnt/{storage_account}',
---> 11     extra_configs = {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net:{storage_account_key}'}
     12 )

/local_disk0/tmp/1625601199293-0/dbutils.py in f_with_exception_handling(*args, **kwargs)
    298             def f_with_exception_handling(*args, **kwargs):
    299                 try:
--> 300                     return f(*args, **kwargs)
    301                 except Py4JJavaError as e:
    302                     class ExecutionError(Exception):

/local_disk0/tmp/1625601199293-0/dbutils.py in mount(self, source, mount_point, encryption_type, owner, extra_configs)
    389                 self.check_types([(owner, string_types)])
    390             java_extra_configs = \
--> 391                 MapConverter().convert(extra_configs, self.sc._jvm._gateway_client)
    392             return self.print_return(self.dbcore.mount(source, mount_point,
    393                                                        encryption_type, owner,

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_collections.py in convert(self, object, gateway_client)
    520         HashMap = JavaClass("java.util.HashMap", gateway_client)
    521         java_map = HashMap()
--> 522         for key in object.keys():
    523             java_map[key] = object[key]
    524         return java_map

AttributeError: 'set' object has no attribute 'keys'

It seems related to the extra_configs parameter, but I'm not sure what exactly. Can anyone see what I'm missing?


1 Answer

Answered 2024-06-12 21:45:41

The real error in your case is that extra_configs must be a dictionary, but you're passing a set: {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net:{storage_account_key}'}. This happens because the syntax is wrong (two quotes are missing): the colon ends up inside the f-string, so Python parses the whole expression as a one-element set literal instead of a dict literal. The correct syntax is: {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net': storage_account_key}
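The difference is easy to reproduce in plain Python. Using placeholder values (not real credentials), the misplaced quote turns the braces into a set literal rather than a dict literal:

```python
storage_account = "stgdev"             # name from the question
storage_account_key = "<account-key>"  # placeholder, not a real key

# Wrong: the colon sits inside the f-string, so the braces build a one-element set
wrong = {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net:{storage_account_key}'}
print(type(wrong).__name__)  # set

# Right: the f-string closes before the colon, so the braces build a dict
right = {f'fs.azure.accountkey.{storage_account}.dfs.core.windows.net': storage_account_key}
print(type(right).__name__)  # dict
```

Py4J's MapConverter then calls .keys() on whatever you pass as extra_configs, which a dict has and a set doesn't, hence the AttributeError in the traceback.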

But in fact you can't mount with the abfss protocol using a storage account key; key-based mounting is only supported with the wasbs protocol. For abfss you must use a service principal and provide its ID & secret, like this (see the documentation):

configs = {"fs.azure.account.auth.type": "OAuth",
          "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
          "fs.azure.account.oauth2.client.id": "<application-id>",
          "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>",key="<service-credential-key-name>"),
          "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-id>/oauth2/token"}

dbutils.fs.mount(
  source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
  mount_point = "/mnt/<mount-name>",
  extra_configs = configs)

While in theory you could mount ADLS Gen2 storage with the wasbs protocol and a storage key, it's not recommended because you may run into problems (in my personal experience). Also, using the storage account key itself is discouraged; a shared access signature is better because it's more secure.
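For completeness, a wasbs mount with an account key would look roughly like this. This is a hedged sketch reusing the names from the question; the dbutils.fs.mount call only exists inside a Databricks notebook, so it is shown commented out:

```python
storage_account = "stgdev"             # names taken from the question
storage_container = "abc-test"
storage_account_key = "<account-key>"  # placeholder; read it from the secret scope in practice

# wasbs uses the blob endpoint (blob.core.windows.net) and the
# fs.azure.account.key.* config key, unlike the abfss/dfs endpoint above
wasbs_configs = {
    f"fs.azure.account.key.{storage_account}.blob.core.windows.net": storage_account_key
}

# In a Databricks notebook this would be:
# dbutils.fs.mount(
#     source=f"wasbs://{storage_container}@{storage_account}.blob.core.windows.net/",
#     mount_point=f"/mnt/{storage_account}",
#     extra_configs=wasbs_configs,
# )
print(list(wasbs_configs)[0])
```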
