How do I implement Prometheus monitoring across multiple Celery tasks?

Posted 2024-04-25 23:51:29


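I have a setup running multiple (3) Celery workers, and I have 8 different tasks:

- celery
- high-frequency jobs: task 1, task 2
- low-frequency jobs: tasks 3-8

Each of them runs in its own Kubernetes pod.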

I want to implement monitoring with Prometheus. For this I am using the prometheus_client library:

import os

from celery import Celery, signals
from prometheus_client import Summary
from prometheus_client import start_http_server as start_prometheus_http_server

REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
BROKER_URL = f"redis://{REDIS_HOST}:6379/0"
app = Celery("tasks", broker=BROKER_URL)

app.conf.task_routes = {
    "hifreq.main": {"queue": "main_queue"},
    "hifreq.final": {"queue": "final_queue"},
    "lowfreq.*": {"queue": "lowfreq_queue"},
}

# Start the metrics HTTP endpoint once the worker has finished its setup.
@signals.celeryd_after_setup.connect
def setup_direct_queue(sender, instance, **kwargs):
    start_prometheus_http_server(9090)

@app.task(name="hifreq.main")
def long_running_task():
    data_loading()

DATA_LOADING_TIME = Summary(
    "data_loading_seconds",
    "Time spent loading the data",
)

@DATA_LOADING_TIME.time()
def data_loading():
    pass

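This starts the Prometheus HTTP server (I think it starts one per worker). I have exposed it through an ingress/service so that I can reach the server, and when I navigate to the pod running the "hifreq" worker I get: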

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 5828.0
python_gc_objects_collected_total{generation="1"} 1643.0
python_gc_objects_collected_total{generation="2"} 294.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 152.0
python_gc_collections_total{generation="1"} 13.0
python_gc_collections_total{generation="2"} 2.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="6",patchlevel="9",version="3.6.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 3.19164416e+08
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.4453888e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.58876788561e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 2.1
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 30.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 1.048576e+06

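These are the default Python metrics, but the data_loading_seconds metric that I defined myself is missing. I suspect something goes wrong because each of the workers has its own server, but I am not sure what exactly the problem is. Any help is appreciated.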
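One possible explanation, assuming the workers use Celery's default prefork pool: task bodies run in forked child processes, while the HTTP server started from celeryd_after_setup runs in the parent process and can only see the parent's in-memory state. The sketch below illustrates that effect with the standard library only. OBSERVATIONS is a hypothetical stand-in for a metric registry and is not part of prometheus_client, and the "fork" start method (Unix only) is chosen here to mimic a prefork pool.

```python
import multiprocessing as mp

# Hypothetical stand-in for an in-memory metric registry
# (NOT prometheus_client; used only to illustrate process isolation).
OBSERVATIONS = []


def data_loading():
    """Pretend task body: record one observation in this process's memory."""
    OBSERVATIONS.append(1.0)


def run_task_in_child(queue):
    """Runs inside the child process and reports how many observations it sees."""
    data_loading()
    queue.put(len(OBSERVATIONS))


def observe_in_child_and_parent():
    """Return (count seen by the child, count seen by the parent)."""
    # Assumption: "fork" mimics a prefork worker pool; it is Unix only.
    ctx = mp.get_context("fork")
    queue = ctx.Queue()
    child = ctx.Process(target=run_task_in_child, args=(queue,))
    child.start()
    seen_by_child = queue.get()
    child.join()
    # The parent's list is untouched: the child modified its own copy.
    return seen_by_child, len(OBSERVATIONS)


if __name__ == "__main__":
    print(observe_in_child_and_parent())  # prints (1, 0)
```

If this is indeed the cause, the observation is recorded, just not in the process that serves port 9090. prometheus_client documents a multiprocess mode (the PROMETHEUS_MULTIPROC_DIR environment variable together with multiprocess.MultiProcessCollector) intended for aggregating samples written by several processes. Whether it fits this particular setup is untested here.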

