我有一个Django应用程序,可以运行一些芹菜链接任务。 第一个任务向一个外部应用程序请求它的DB转储,第二个任务下载它,第三个任务给我带来问题
proc = Popen([
"mysql", "-h{}".format(settings.MYSQL_HOST), "-u{}".format(settings.MYSQL_USER),
"-p{}".format(settings.MYSQL_PASSWORD), snapshot_restore.database_name
], stdout=PIPE, stderr=PIPE, stdin=sed_dumpfile)
这个过程大约需要8小时
snapshot_restore = SnapshotRestore.objects.get(id=snapshot_restore_id)
snapshot_restore.status = "processing"
snapshot_restore.save()
# Here I make the unzip, the sed and the restore
snapshot_restore.status = "finished"
snapshot_restore.database_status = "active"
snapshot_restore.save()
但当我有这个错误的时候
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] File "/usr/local/lib/python3.6/site-packages/django/db/backends/mysql/base.py", line 266, in _set_autocommit
self.connection.autocommit(autocommit)
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 256, in autocommit
_mysql.connection.autocommit(self, on)
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] _mysql_exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] The above exception was the direct cause of the following exception:
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] Traceback (most recent call last):
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] File "/srv/src/core/tasks.py", line 243, in restore_snapshot_task
snapshot_restore.save()
try:
snapshot_restore.save()
except Exception as e:
logger.error('Error updating snapshot_restore status, will retry.')
snapshot_restore = SnapshotRestore.objects.get(id=snapshot_restore_id)
snapshot_restore.status = "finished"
snapshot_restore.database_status = "active"
snapshot_restore.save()
但还是有同样的错误
[2020-04-19 11:33:20,701: INFO/ForkPoolWorker-2] core.tasks.restore_snapshot_task[None]: restore complete in 28783.03s
[2020-04-19 11:33:20,701: ERROR/ForkPoolWorker-2] core.tasks.restore_snapshot_task[None]: Error updating snapshot_restore status, will retry.
[2020-04-19 11:33:20,720: WARNING/ForkPoolWorker-2] --- Logging error ---
[2020-04-19 11:33:20,722: WARNING/ForkPoolWorker-2] Traceback (most recent call last):
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] File "/usr/local/lib/python3.6/site-packages/django/db/backends/mysql/base.py", line 266, in _set_autocommit
self.connection.autocommit(autocommit)
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] File "/usr/local/lib/python3.6/site-packages/MySQLdb/connections.py", line 256, in autocommit
_mysql.connection.autocommit(self, on)
[2020-04-19 11:33:20,723: WARNING/ForkPoolWorker-2] _mysql_exceptions.OperationalError: (2013, 'Lost connection to MySQL server during query')
我尝试增加MySql空闲连接时间(interactive_timeout),因为这是一个AWS RDS实例,我必须通过参数组来完成,如我在这里发现的:
Amazon RDS unable to execute SET GLOBAL command
我将其设置为12小时,然后终止一个空闲连接,但仍然得到相同的“断开连接”错误
我写了这个函数
def keep_connection_alive(id, stop):
logger.info("Starting keep_connection_alive iteration id: " + str(id))
while True:
time.sleep(600)
try:
SnapshotRestore.objects.get(id=randrange(10))
except Exception as e:
logger.error('Error keeping alive: ' + str(e))
if stop():
print('Stopping keep_connection_alive')
break
logger.info('keep_connection_alive out of while')
并在任务开始时这样称呼它
stop_thread = False
keep_alive_thread = threading.Thread(target=keep_connection_alive, args=(1, lambda: stop_thread))
keep_alive_thread.start()
我有很多
[2020-04-19 07:46:47,523: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: SnapshotRestore matching query does not exist.
因为大多数时候,身份证都不存在。我运行了这段代码两次,第一次是在启动进程8小时后出现错误
[2020-04-18 10:28:35,893: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: SnapshotRestore matching query does not exist.
[2020-04-18 10:38:36,016: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: SnapshotRestore matching query does not exist.
[2020-04-18 10:52:56,766: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: (2013, 'Lost connection to MySQL server during query')
[2020-04-18 11:02:56,867: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: (2006, 'MySQL server has gone away')
[2020-04-18 11:12:56,955: ERROR/ForkPoolWorker-2] ???[???]: Error keeping alive: (2006, 'MySQL server has gone away')
第二次我得到它时,它正试图保存恢复
snapshot_restore.save()
我只剩下一个想法了,那就是触发另一个芹菜任务,它只是改变了模型的状态,所以信号也会被触发,下一个任务就会开始。 但除了测试和祈祷它能起作用之外,我想了解为什么会发生这种情况,以及是否有可能的解决办法
提前谢谢
目前没有回答
相关问题 更多 >
编程相关推荐