YOLOv5 hangs in a container

Asked 2025-04-12 17:51

I'm having trouble getting Docker, Python (3.11), and YOLOv5 to work together.

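I have reduced the problem to the script below. It works fine when I debug it on Windows 10, but it hangs when run in Docker (output below). In the debugger it prints Finished and then waits for Ctrl+C, which is expected. If I remove the infinite loop at the end, it also runs normally in the container: it prints sleepy time first and then Finished. Only when the loop is present does it get stuck while loading the model. This is a problem for me because the program is meant to be part of a larger system that waits for new files to appear and runs the model on each one (using the watchdog.observers library).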

To complicate matters, Python is not my first language, and I did not create the model myself.

from time import sleep

import torch
import os
import pathlib


def start_watching():
    print("test no watchdog, yes loop")    

    try:        
        if os.name == 'nt':
            pathlib.PosixPath = pathlib.WindowsPath
        #hardcoded paths for testing only
        model = torch.hub.load('./yolov5-master', 'custom', source='local', path='./best.pt', force_reload=True) 
        print("got model")
        results = model('./240206-154354_wheelset_149_22241687_Image_02.jpg')        
        labels, cord = results.xyxyn[0][:, -1], results.xyxyn[0][:, :-1]
        print(labels,cord)
    except Exception as ex:
        print(ex)
    
    print("sleepy time") 
    sleep(100)
    print("Finished")
    
    waitForFiles = True
    try:
        while waitForFiles:
            sleep(10)
    except KeyboardInterrupt:
        waitForFiles = False
        print("going down")


start_watching()

However, when I run this in the container, the output I get is:

test no watchdog, yes loop
YOLOv5  2024-3-5 Python-3.11.6 torch-2.2.1+cpu CPU

Fusing layers...
Model summary: 157 layers, 7012822 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape...

It never gets past Adding AutoShape, no matter how long I wait. That line is printed while the model is loading.
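A minimal diagnostic sketch for narrowing this down; it is not a fix. It assumes (nothing in the post confirms this) that the missing lines are either still sitting in Python's output buffer or that the process really is blocked inside the model load. `print(..., flush=True)` rules out the first case, and the standard-library `faulthandler.dump_traceback_later` periodically dumps every thread's stack to stderr, which would show where the process is waiting. The helper names `log` and `enable_hang_diagnostics` are hypothetical.

```python
import faulthandler
import sys
import time


def enable_hang_diagnostics(timeout_s: float = 60.0) -> None:
    """Dump all thread stacks to stderr every timeout_s seconds until cancelled."""
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=sys.stderr)


def log(msg: str) -> str:
    """Print a timestamped line and flush it immediately."""
    line = f"{time.strftime('%H:%M:%S')} {msg}"
    print(line, flush=True)
    return line


if __name__ == "__main__":
    enable_hang_diagnostics(60.0)
    log("before model load")
    # The torch.hub.load(...) call from the script above would go here.
    log("after model load")
    faulthandler.cancel_dump_traceback_later()
```

If "after model load" shows up once the prints are flushed, the load itself is completing and only the output was delayed; if it never shows up, the periodic stack dumps should point at the blocking call.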

My Dockerfile is:

FROM yolobase
#Copy the code in
COPY . .
RUN echo "min not working"

ENTRYPOINT python3 test.py

The yolobase image mentioned above is built from the Dockerfile that ships with the YOLO version used to build the model (I was told this matters).

# YOLOv5  by Ultralytics, AGPL-3.0 license
# Builds ultralytics/yolov5:latest-cpu image on DockerHub https://hub.docker.com/r/ultralytics/yolov5
# Image is CPU-optimized for ONNX, OpenVINO and PyTorch YOLOv5 deployments

# Start FROM Ubuntu image https://hub.docker.com/_/ubuntu
FROM ubuntu:23.10

# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/

# Install linux packages
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
    && apt install --no-install-recommends -y python3-pip git zip curl htop libgl1 libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3

# Remove python3.11/EXTERNALLY-MANAGED or use 'pip install --break-system-packages' avoid 'externally-managed-environment' Ubuntu nightly error
RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED

# Install pip packages
COPY requirements.txt .
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -r requirements.txt albumentations gsutil notebook \
    coremltools onnx onnx-simplifier onnxruntime 'openvino-dev>=2023.0' \
    # tensorflow tensorflowjs \
    --extra-index-url https://download.pytorch.org/whl/cpu

# Create working directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

# Copy contents
COPY . /usr/src/app


# Usage Examples -------------------------------------------------------------------------------------------------------

# Build and Push
# t=ultralytics/yolov5:latest-cpu && sudo docker build -f utils/docker/Dockerfile-cpu -t $t . && sudo docker push $t

# Pull and Run
# t=ultralytics/yolov5:latest-cpu && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/datasets:/usr/src/datasets $t

However, I get the same problem if I use FROM ultralytics/yolov5:latest-cpu instead.

The requirements used by the second Dockerfile (the one for yolobase) are:

# YOLOv5 requirements

# Usage: pip install -r requirements.txt

# Base ------------------------------------------------------------------------
gitpython>=3.1.30
matplotlib>=3.3
numpy>=1.23.5
opencv-python>=4.1.1
Pillow>=9.4.0
psutil  # system resources
PyYAML>=5.3.1
requests>=2.23.0
scipy>=1.4.1
thop>=0.1.1  # FLOPs computation
torch>=1.8.0  # see https://pytorch.org/get-started/locally (recommended)
torchvision>=0.9.0
tqdm>=4.64.0
ultralytics>=8.0.232
# protobuf<=3.20.1  # https://github.com/ultralytics/yolov5/issues/8012

# Logging ---------------------------------------------------------------------
# tensorboard>=2.4.1
# clearml>=1.2.0
# comet

# Plotting --------------------------------------------------------------------
pandas>=1.1.4
seaborn>=0.11.0

# Export ----------------------------------------------------------------------
# coremltools>=6.0  # CoreML export
# onnx>=1.10.0  # ONNX export
# onnx-simplifier>=0.4.1  # ONNX simplifier
# nvidia-pyindex  # TensorRT export
# nvidia-tensorrt  # TensorRT export
# scikit-learn<=1.1.2  # CoreML quantization
# tensorflow>=2.4.0,<=2.13.1  # TF exports (-cpu, -aarch64, -macos)
# tensorflowjs>=3.9.0  # TF.js export
# openvino-dev>=2023.0  # OpenVINO export

# Deploy ----------------------------------------------------------------------
setuptools>=65.5.1 # Snyk vulnerability fix
# tritonclient[all]~=2.24.0

# Extras ----------------------------------------------------------------------
# ipython  # interactive notebook
# mss  # screenshots
# albumentations>=1.0.3
# pycocotools>=2.0.6  # COCO mAP

Note that the comments are exactly as downloaded from GitHub; I have not commented anything out or added anything.

1 Answer


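This script is adapted from a solution I found on Stack Overflow. It uses the watchdog library to monitor a folder for changes. When a new image is detected, it executes a command that processes the image with the YOLOv5 model. The flow should be simple: as soon as a new image appears, the script runs the detection model and then waits for the next image.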

#!/usr/bin/python
import time
import os
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

class MyHandler(FileSystemEventHandler):
    def run_yolo(self, path):
        if path.endswith('jpg'):
            cmd = f'python3 detect.py --source {path} --weights yolov5s.pt'
            print("Running command: ", cmd)
            os.system(cmd)

    def on_created(self, event):
        print(f'event type: {event.event_type}  path : {event.src_path}')
        self.run_yolo(event.src_path)

    def on_modified(self, event):
        pass
    
    def on_moved(self, event):
        pass


if __name__ == "__main__":
    event_handler = MyHandler()
    observer = Observer()
    observer.schedule(event_handler, path='imgs/', recursive=False)
    observer.start()

    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

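To test it, I downloaded an image with wget (wget https://predictivehacks.com/wp-content/uploads/2019/10/cycling001-1024x683.jpg), which successfully triggered the script. However, something unexpected happened: the script ran detection on the same image three times. The output clearly shows the detection command being triggered repeatedly, each time detecting the same objects in the image and saving the results to a new directory.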

event type: modified  path : imgs
event type: modified  path : imgs/cycling001-1024x683.jpg
Running command:  python3 detect.py --source imgs/cycling001-1024x683.jpg --weights yolov5s.pt
detect: weights=['yolov5s.pt'], source=imgs/cycling001-1024x683.jpg, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-3-26 Python-3.11.6 torch-2.2.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /usr/src/app/imgs/cycling001-1024x683.jpg: 448x640 1 person, 2 bicycles, 1 car, 42.0ms
Speed: 0.3ms pre-process, 42.0ms inference, 0.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp2
event type: modified  path : imgs/cycling001-1024x683.jpg
Running command:  python3 detect.py --source imgs/cycling001-1024x683.jpg --weights yolov5s.pt
detect: weights=['yolov5s.pt'], source=imgs/cycling001-1024x683.jpg, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-3-26 Python-3.11.6 torch-2.2.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /usr/src/app/imgs/cycling001-1024x683.jpg: 448x640 1 person, 2 bicycles, 1 car, 36.0ms
Speed: 0.3ms pre-process, 36.0ms inference, 0.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp3
event type: modified  path : imgs
event type: modified  path : imgs/cycling001-1024x683.jpg
Running command:  python3 detect.py --source imgs/cycling001-1024x683.jpg --weights yolov5s.pt
detect: weights=['yolov5s.pt'], source=imgs/cycling001-1024x683.jpg, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5  2024-3-26 Python-3.11.6 torch-2.2.1+cpu CPU

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
image 1/1 /usr/src/app/imgs/cycling001-1024x683.jpg: 448x640 1 person, 2 bicycles, 1 car, 36.9ms
Speed: 0.3ms pre-process, 36.9ms inference, 0.6ms NMS per image at shape (1, 3, 640, 640)
Results saved to runs/detect/exp4

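This repetition is not what I want, and I was puzzled as to why the script reacted several times to modifications of the same image. The problem appears to lie in how watchdog events are triggered or handled, not in the YOLOv5 model itself. I am considering changing the script so that each image is processed only once, possibly with a check that ignores further detections for the same image unless it is modified again.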
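A minimal sketch of such a check, under the assumption that a file's modification time and size are enough to tell whether it changed since it was last processed. The class name `SeenFiles` and its method are hypothetical and not part of watchdog.

```python
import os
from typing import Dict, Tuple


class SeenFiles:
    """Remember the (mtime_ns, size) signature last processed for each path."""

    def __init__(self) -> None:
        self._seen: Dict[str, Tuple[int, int]] = {}

    def should_process(self, path: str) -> bool:
        """Return True only if the file exists and changed since the last call."""
        try:
            st = os.stat(path)
        except FileNotFoundError:
            return False
        signature = (st.st_mtime_ns, st.st_size)
        if self._seen.get(path) == signature:
            return False  # same content as last time, skip it
        self._seen[path] = signature
        return True
```

In the handler this could be called as `if seen.should_process(event.src_path): ...` before launching detection. It only filters events that arrive after the file has stopped changing; events fired while a download is still growing the file would each look like a new modification.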

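Additional note: the unexpected repetition seems to be related to how wget behaves while downloading the image; it may create temporary files, which could inadvertently trigger several events. When I instead moved the file into the folder with mv, the script appeared to work as expected and processed each image only once.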

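Further note: switching the code to the on_created event handler (the listing above already reflects this change) makes it safe to download images into the watched directory with wget.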
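One caveat with reacting on creation is that the event may fire before the download has finished writing the file. The sketch below is one way to guard against that, assuming that a file whose size stays unchanged across a couple of polls is complete. The function name `wait_until_stable` and its parameters are hypothetical.

```python
import os
import time


def wait_until_stable(path: str, interval_s: float = 0.5,
                      checks: int = 2, timeout_s: float = 30.0) -> bool:
    """Return True once the file size was unchanged for `checks` consecutive polls."""
    deadline = time.monotonic() + timeout_s
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            size = -1  # file missing or not readable yet
        if size >= 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
            last_size = size
        time.sleep(interval_s)
    return False  # gave up: the file never settled within timeout_s
```

Calling this at the start of `run_yolo` and skipping the image when it returns False would delay detection slightly but avoid running on a partially written file.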

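Does the approach I tested help with the hang when using YOLOv5 in a Docker container, or is there a more effective strategy for making sure new images are processed smoothly and only once?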
