Python script fails randomly: what tools can I use to determine why?

Posted 2024-04-24 20:37:44

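I have a Python script that is part of a pool automation project. Over time I have modified it a lot, improving it and adding functionality. Because of this, I did not get a chance to let it run for extended periods until recently, when I (almost) got it where I wanted it. Now that I let it run all the time, it fails randomly and gets restarted (via watchdog support).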

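I run this script on a Raspberry Pi 3 via systemd with watchdog support, since I want/need it running all the time. When the script fails, the watchdog catches it and restarts it as expected, but I would rather figure out what is causing the script to fail in the first place.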
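For context, a watchdog ping under `Type=notify` boils down to roughly the sketch below. It is a minimal illustration rather than the exact code in my script: the names `sd_notify` and `watchdog_ping` are made up here, and the optional `socket_path` argument exists only so the function can be exercised without systemd.

```python
import os
import socket


def sd_notify(message, socket_path=None):
    """Send one notification datagram to systemd; return True if it was sent."""
    # systemd exports NOTIFY_SOCKET to services started with Type=notify.
    path = socket_path or os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under systemd notify supervision
    if path.startswith("@"):
        path = "\0" + path[1:]  # "@" marks an abstract-namespace socket
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        sock.sendto(message.encode("ascii"), path)
    finally:
        sock.close()
    return True


def watchdog_ping():
    """Tell systemd the service is still alive (hypothetical helper name)."""
    return sd_notify("WATCHDOG=1")
```

With `WatchdogSec=70s`, systemd treats the service as failed if no `WATCHDOG=1` message arrives within 70 seconds, so anything in the main loop that blocks longer than that would trigger a restart even without a Python exception.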

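The script connects to a MySQL database, grabs some information about the pool water level and how many watts my pool pump is using, and then determines whether the pool needs to be filled. If it does, we use a relay to open a sprinkler valve connected to the pool; otherwise we do nothing. We also check whether the sprinklers are running, whether the pool pump is running, and whether someone has thrown a physical disconnect switch. It uses a number of status LEDs and a couple of switches, plus an LCD screen that communicates with the Pi over serial.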
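The decision described above can be sketched as a small pure function. This is a simplification under stated assumptions, not the logic from the real script: the function names, the 100 W threshold, and the rule that filling is blocked while the sprinklers or the pump are running are all assumptions made for illustration.

```python
# Hypothetical threshold: readings above this count as "pump running".
# (The log further down shows 12 W being treated as pump off.)
PUMP_RUNNING_WATTS = 100


def pump_is_running(watts, threshold=PUMP_RUNNING_WATTS):
    """Interpret the pump's power draw as a running / not running flag."""
    return watts > threshold


def should_fill(level_ok, sprinklers_running, pump_watts, manual_disconnect):
    """Return True only when the pool is low and nothing blocks a fill."""
    if manual_disconnect:
        return False  # physical disconnect switch overrides everything
    if sprinklers_running or pump_is_running(pump_watts):
        return False  # assumed: do not fill while either is running
    return not level_ok
```

Keeping the decision separate from the database, GPIO, and serial code would make it possible to test it on its own and to rule it out as the source of the failures.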

Other than sshd and systemd, this script is pretty much the only thing running on the Pi: no apache, no Node-RED, no ftp, etc.

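I have an ssh session open to the Pi and it never drops, even when the script fails. A continuous ping of the Pi shows zero packet loss, even when the script fails. When the script fails and restarts, my syslog shows the following: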

Jun  6 08:08:56 scruffy systemd[1]: Unit pool_control.service entered failed state.
Jun  6 08:08:57 scruffy systemd[1]: pool_control.service holdoff time over, scheduling restart.
Jun  6 08:08:57 scruffy systemd[1]: Stopping Installing Python script for Pool Fill Control /w watchdog...
Jun  6 08:08:57 scruffy systemd[1]: Starting Installing Python script for Pool Fill Control /w watchdog...
Jun  6 08:08:58 scruffy systemd[1]: Started Installing Python script for Pool Fill Control /w watchdog.
Jun  6 08:08:58 scruffy kernel: [34864.219647] gpiomem-bcm2835 3f200000.gpiomem: gpiomem device opened.

When the script fails and restarts, dmesg shows:

^{pr2}$

My program log does not show anything unusual:

2016-06-06 13:26:24,387 INFO Notify socket = /run/systemd/notify
2016-06-06 13:26:24,616 DEBUG PushBullet Notification Sent - Pool fill control started successfully
2016-06-06 13:26:24,617 INFO pool_fill_control.py V2.6 (2016-06-05) started
2016-06-06 13:26:25,182 DEBUG Sprinklers are not running (RACHIO).
2016-06-06 13:26:25,183 DEBUG SPRINKLER_RUN_LED should be OFF. This is a BLUE LED
2016-06-06 13:26:25,184 DEBUG Watchdog Ping Sent
2016-06-06 13:26:25,611 DEBUG get_pool_level returned 1
2016-06-06 13:26:25,764 DEBUG pool_pump_running_watts returned 12 watts in use by pump.
2016-06-06 13:26:25,765 DEBUG PUMP_RUN_LED should be OFF. This is the YELLOW LED
2016-06-06 13:26:25,766 DEBUG POOL_FILLING_LED should be OFF. This is a BLUE LED
2016-06-06 13:26:25,766 DEBUG Pool Level OK (PFC_LEVEL_OK) sent to MightyHat

Here is the output of top while the script is running:

top - 13:29:36 up 15:01,  3 users,  load average: 0.05, 0.07, 0.05
Tasks: 119 total,   1 running, 118 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  1.2 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:    947760 total,   390032 used,   557728 free,   114444 buffers
KiB Swap:   102396 total,        0 used,   102396 free.    97648 cached Mem

And meminfo:

root scruffy: log #  cat /proc/meminfo 
MemTotal:         947760 kB
MemFree:          558160 kB
MemAvailable:     864020 kB
Buffers:          114460 kB
Cached:            97640 kB
SwapCached:            0 kB
Active:           202888 kB
Inactive:          31192 kB
Active(anon):      23672 kB
Inactive(anon):     6140 kB
Active(file):     179216 kB
Inactive(file):    25052 kB
Unevictable:        1744 kB
Mlocked:            1744 kB
SwapTotal:        102396 kB
SwapFree:         102396 kB
Dirty:                16 kB
Writeback:             0 kB
AnonPages:         23844 kB
Mapped:            19188 kB
Shmem:              6424 kB
Slab:             140780 kB
SReclaimable:     132312 kB
SUnreclaim:         8468 kB
KernelStack:        1000 kB
PageTables:          668 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      576276 kB
Committed_AS:      92620 kB
VmallocTotal:    1114112 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
CmaTotal:           8192 kB
CmaFree:            3736 kB

Here is some more system information:

root scruffy: log #  uptime
13:41:58 up 15:14,  3 users,  load average: 0.02, 0.04, 0.05

root scruffy: log #  uname -a
Linux scruffy 4.4.9-v7+ #884 SMP Fri May 6 17:28:59 BST 2016 armv7l GNU/Linux

Here is the systemd unit file that starts and stops the script:

# This script starts and stops our pool fill control python script

[Unit]
Description=Installing Python script for Pool Fill Control /w watchdog
Requires=basic.target
After=multi-user.target

[Service]
Type=notify
WatchdogSec=70s
ExecStart=/usr/bin/python /root/pool_control/pool_fill_control.py
ExecStop=/root/pool_control/setupgpio.sh
Restart=always

# The number of times the service is restarted within a time period can be set
# If that condition is met, the RPi can be rebooted
#
StartLimitBurst=4
StartLimitInterval=180s
# actions can be none|reboot|reboot-force|reboot-immediate
StartLimitAction=none

# The following are defined in the /etc/systemd/system.conf file and are
# global for all services
#
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
#
# They can also be set on a per-service basis here:
# if they are not defined here, they fall back to the system.conf values
TimeoutStartSec=2s
TimeoutStopSec=2s

[Install]
WantedBy=multi-user.target

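I have tried running it on a fresh install of jessie and moving it to a different Pi, all with the same result: after an indeterminate period of time the script fails and the watchdog restarts it.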

The script in question is pretty long, so I am not sure of the correct procedure for posting it here, but I have it on GitHub:

https://github.com/rjsears/Pool_Fill_Control/blob/master/pool_fill_control.py

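I am looking for guidance on how to troubleshoot my code to determine what is causing it to fail, or whether I have some blatantly bad code that would jump right out at someone with more Python experience. I don't have that much experience, and this is my first (what I would consider real) Python script.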
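Since the program log above ends without any traceback, one candidate first step would be to route uncaught exceptions into that same log. The following is only a sketch using the standard `logging` and `sys` modules; `install_excepthook` is a hypothetical name, and it assumes a logger has already been configured elsewhere.

```python
import logging
import sys


def install_excepthook(logger):
    """Log any uncaught exception, with its traceback, before the process exits."""

    def handle(exc_type, exc_value, exc_tb):
        if issubclass(exc_type, KeyboardInterrupt):
            # Leave Ctrl-C to the default handler.
            sys.__excepthook__(exc_type, exc_value, exc_tb)
            return
        logger.critical(
            "Uncaught exception", exc_info=(exc_type, exc_value, exc_tb)
        )

    sys.excepthook = handle
    return handle
```

This would only help if the script dies from an exception in the main thread. If it instead hangs past the 70-second watchdog window, or is killed by a signal, nothing would be logged this way and a different approach would be needed.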

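Eventually I want to interface it with an internal website that would duplicate the physical functions (button pushes, LEDs) via a web page, but I want the script working correctly before going any further.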

Any help or guidance would be greatly appreciated!

