Loading config.json from an egg file on Airflow

Published 2024-05-07 23:53:17


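How can I load a configuration file from an egg file? I am trying to run Python code that is packaged as an egg file and executed on Airflow. The code tries to load a config.json file, but this fails when it runs on Airflow. My guess is that it is trying to read the file from inside the egg, and because the egg is compressed, the file cannot be found. I updated setup.py as shown below to make sure the config file is included in the package: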

from setuptools import find_packages, setup

setup(
    name='tv_quality_assurance',
    packages=find_packages(),
    version='0.1.0',
    description='Quality checks on IPTV linear viewing data',
    author='Sarah Berenji',
    data_files=[('src/codes', ['src/codes/config.json'])],
    include_package_data=True,
    license='',
)

Now it complains that config_file_path is not a directory:

NotADirectoryError: [Errno 20] Not a directory: '/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json'

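I checked the path and the JSON file is there. Here is my code, with some print statements added for debugging, which show that config_file_path is treated as neither a file nor a directory: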

import json
import os

dir_path = os.path.dirname(__file__)
config_file_path = dir_path + '/config.json'

print(f"config_file_path = {config_file_path}")
print(f"relpath(config_file_path) = {os.path.relpath(config_file_path)}")

if not os.path.isfile(config_file_path):
    print(f"{config_file_path} is not a file")
if not os.path.isdir(config_file_path):
    print(f"{config_file_path} is not a dir")

with open(config_file_path) as json_file:
    config = json.load(json_file)

It produces the following output:

config_file_path = /opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json
relpath(config_file_path) = ../../artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json
/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json is not a file
/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json is not a dir

Traceback (most recent call last):
  File "/opt/test_AF1.10.2_py2/dags/py_spark_entry_point.py", line 8, in <module>
    execute(spark)
  File "/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/entry_point.py", line 26, in execute
  File "/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/data_methods.py", line 32, in load_config_file
NotADirectoryError: [Errno 20] Not a directory: '/opt/artifacts/project-0.1.0.dev8-py3.6.egg/src/codes/config.json'
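
The output above is consistent with the guess about the compressed egg: a zipped egg is a single archive file, so a path that continues past the .egg name is not a real filesystem path, and both os.path.isfile and open() fail on it. The following is a minimal sketch of that behavior and of one possible way around it, pkgutil.get_data, which asks the package's loader for the bytes instead of touching the filesystem. The archive name demo_project.egg, the package name demo_codes and the config contents are hypothetical stand-ins, not the project from the question.

```python
import json
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zip archive that mimics a zipped egg.
# All names here (demo_project.egg, demo_codes) are hypothetical.
tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, "demo_project.egg")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("demo_codes/__init__.py", "")
    archive.writestr("demo_codes/config.json", '{"threshold": 5}')

# Putting the archive on sys.path lets zipimport load the package from it.
sys.path.insert(0, archive_path)
import demo_codes  # noqa: E402

# __file__ points inside the archive, so this is not a real filesystem path.
config_file_path = os.path.join(os.path.dirname(demo_codes.__file__), "config.json")
path_is_file = os.path.isfile(config_file_path)

# open() fails because the .egg component is a regular file, not a directory.
try:
    with open(config_file_path):
        open_error = None
except OSError as exc:
    open_error = type(exc).__name__

# pkgutil.get_data reads the resource through the package loader,
# which also works when the package lives inside a zip archive.
raw = pkgutil.get_data("demo_codes", "config.json")
config = json.loads(raw.decode("utf-8"))

print(f"isfile: {path_is_file}, open() error: {open_error}, config: {config}")
```

On Linux the open() call in this sketch typically fails with the same NotADirectoryError as in the traceback; other platforms may report a different OSError subclass.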

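As my next attempt, I tried using importlib_resources, but I ended up with a strange error saying the module is not installed, even though the logs show that pip installed it successfully: ModuleNotFoundError: No module named 'importlib_resources'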

import json

import importlib_resources

config_file = importlib_resources.files("src.codes") / "config.json"
with open(config_file) as json_file:
    config = json.load(json_file)
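
A note on this attempt, offered as a sketch rather than a confirmed fix: files() returns a traversable object, not necessarily a real path, so passing it to the built-in open() may still fail for a zipped egg. Reading through the object's own read_text() method avoids the filesystem. The example below falls back to the importlib_resources backport only when the standard-library importlib.resources.files (Python 3.9+) is unavailable. The names resources_demo.egg and demo_pkg are hypothetical. The ModuleNotFoundError itself may come from pip having installed the backport for a different interpreter than the one that runs the job, but the post does not confirm this.

```python
import json
import os
import sys
import tempfile
import zipfile

try:
    # Standard library on Python 3.9 and later.
    from importlib.resources import files
except ImportError:
    # Backport for older interpreters; it has to be installed
    # for the same interpreter that runs this code.
    from importlib_resources import files

# Build a throwaway zipped package (hypothetical names) and make it importable.
tmp_dir = tempfile.mkdtemp()
archive_path = os.path.join(tmp_dir, "resources_demo.egg")
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("demo_pkg/__init__.py", "")
    archive.writestr("demo_pkg/config.json", '{"source": "egg"}')
sys.path.insert(0, archive_path)

# files() returns a traversable object; read it with its own methods
# instead of passing it to the built-in open().
resource = files("demo_pkg").joinpath("config.json")
config = json.loads(resource.read_text(encoding="utf-8"))

print(config)
```

In the question's layout, the equivalent call would presumably be files("src.codes").joinpath("config.json").read_text(), assuming src.codes is importable as a package from the egg.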
