Submitted Snakemake rule starts running and fails immediately, but Snakemake keeps running without raising any error

Question

I wrote a Snakemake pipeline for my project; part of it looks like this:

SAMPLES, = glob_wildcards("/absolute/path/to/samples/{sample}.bam")
rule all:
    input:
        expand("splits/sample_check/{sample}_done.txt", sample=SAMPLES)

rule GVCFSplit:
    input:
        "gvcf/{SAMPLES}/",
        #"chr_pos_test/chr{c}/chr{c}_reg{i}.txt"
    output:
        "splits/sample_check/{SAMPLES}_done.txt"
    log:
        "logs/GVCFSplit/{SAMPLES}_done.log"
    benchmark:
        "benchmarks/GVCFSplit/{SAMPLES}_done.benchmark.txt"
    envmodules:
        "bcftools"
    resources:
        mem='1g',
        time='4:00:00',
        threads=1
    shell:
        r"""
            python3 /absolute/path/to/python/script/GVCF_split.py {wildcards.SAMPLES}
        """

This rule uses the following Python script to split each chromosome's file into 50 Mb chunks:

from pathlib import Path
import subprocess
import sys

sample_id = sys.argv[1].strip()
chrs = list(range(1, 23))

# NOTE: chr_reg (number of regions per chromosome) is used below,
# but its definition is not shown in the question.
exit_code = 0  # sum of bcftools return codes; avoids shadowing the built-in exit

for c in chrs:
    sample_file = "/path/to/chromosome/files/per/sample/%s/%s_chr%i.g.vcf.gz" % (sample_id, sample_id, c)
    for r in range(1, chr_reg[c] + 1):
        reg_file = "/path/to/per/chromosome/regions/chr%i/chr%i_reg%i.txt" % (c, c, r)
        #out_file=("try/gvcf/splits/chr%i/%s_reg%i.g.vcf.gz") % (c,sample_id,r)
        out_file = "/path/to/spiltted/vcf/files/chr%i/%s_reg%i.g.vcf.gz" % (c, sample_id, r)
        #Path(out_file).touch()
        proc = subprocess.run(["bcftools", "view", sample_file, "-Oz", "-o", out_file, "-R", reg_file])
        exit_code += proc.returncode

if exit_code == 0:
    # Create a marker file so Snakemake can track that everything went fine
    Path("splits/sample_check/" + sample_id + "_done.txt").touch()
    sys.exit(0)
else:
    sys.exit(1)
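
Because the jobs fail only on the cluster, it may help to record what bcftools itself reports. The sketch below is a hypothetical helper (`run_and_log` is not part of the original script) that runs one command, appends its stdout and stderr to a log file, and returns the exit code. It assumes that a missing executable, for example when the bcftools module is not loaded on the compute node, should be reported as exit code 127, the shell convention for "command not found".

```python
import subprocess
import sys
from pathlib import Path


def run_and_log(cmd, log_path):
    """Run cmd, append its output to log_path, and return its exit code.

    Hypothetical helper for illustration; not part of the original script.
    """
    log_path = Path(log_path)
    # Make sure the log directory exists before writing to it.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a") as log:
        log.write("$ " + " ".join(cmd) + "\n")
        try:
            proc = subprocess.run(cmd, capture_output=True, text=True)
        except FileNotFoundError:
            # The executable is not on PATH (e.g. module not loaded).
            log.write("executable not found: %s\n" % cmd[0])
            return 127
        log.write(proc.stdout)
        log.write(proc.stderr)
        log.write("[exit code %d]\n" % proc.returncode)
        return proc.returncode
```

In the loop above, the `subprocess.run([...])` call could then be swapped for something like `run_and_log([...], "logs/GVCFSplit/%s_done.log" % sample_id)`, so that each failing bcftools call leaves a trace in the log file that the rule already declares.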

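When I run this Python script manually with

python3 GVCF_split.py "sample_id"

it runs without problems. However, when I submit the Snakefile to the cluster with the --profile option, a job is submitted for each sample as expected, but each one fails as soon as it starts. Snakemake then keeps running without reporting any error. This is the config file I use with the --profile option: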

cluster: mkdir -p slurm_snake/`basename {workflow.main_snakefile}`/{rule} &&
  sbatch
  --partition={resources.partition}
  --cpus-per-task={resources.threads}
  --mem={resources.mem}
  --time={resources.time}
  --job-name=smk-{rule}-{wildcards}
  --output=try/slurm_snake/`basename {workflow.main_snakefile}`/{rule}/{rule}-{wildcards}-%j.out
default-resources:
  - partition=main
  - mem='4G'
  - time="24:0:0"
  - threads=1
restart-times: 0
max-jobs-per-second: 5
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 1000
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy

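I have a similar setup for my original Snakefile (this one is a copy; I wanted to try a few things before changing the original). With the original, the Slurm output file for every submission of every rule is saved in the slurm_snake folder. For these rules, however, no Slurm output files are generated. What could be the reason? What am I doing wrong when submitting to the cluster?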
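
One thing that may be worth checking in the profile shown above: the `mkdir -p` part creates `slurm_snake/...`, while `--output` points to `try/slurm_snake/...`. As far as I know, Slurm does not create missing directories for `--output`, and a job whose output file cannot be opened tends to fail right away without leaving a log, which would match the symptoms. With `keep-going: True`, Snakemake also continues with other independent jobs after a failure. This is only a possible explanation, not something confirmed by the question. The sketch below compares the two directories in a cluster command string; the helper names (`created_dir`, `output_dir`, `dirs_match`) are made up for illustration, and `CLUSTER_CMD` is the profile's command joined onto one line.

```python
import re

# The profile's cluster command from the question, joined onto one line.
CLUSTER_CMD = (
    "mkdir -p slurm_snake/`basename {workflow.main_snakefile}`/{rule} && "
    "sbatch --partition={resources.partition} "
    "--cpus-per-task={resources.threads} --mem={resources.mem} "
    "--time={resources.time} --job-name=smk-{rule}-{wildcards} "
    "--output=try/slurm_snake/`basename {workflow.main_snakefile}`"
    "/{rule}/{rule}-{wildcards}-%j.out"
)


def created_dir(cmd):
    """Return the directory passed to 'mkdir -p', or None if absent."""
    m = re.search(r"mkdir\s+-p\s+(.+?)\s*&&", cmd)
    return m.group(1) if m else None


def output_dir(cmd):
    """Return the parent directory of the sbatch --output path, or None."""
    m = re.search(r"--output=(.+?)(?:\s+--|\s*$)", cmd)
    return m.group(1).rsplit("/", 1)[0] if m else None


def dirs_match(cmd):
    """True if the directory that is created is the one sbatch writes to."""
    made = created_dir(cmd)
    return made is not None and made == output_dir(cmd)


if __name__ == "__main__":
    print("mkdir creates :", created_dir(CLUSTER_CMD))
    print("sbatch writes :", output_dir(CLUSTER_CMD))
    print("match         :", dirs_match(CLUSTER_CMD))
```

If the two directories differ, making them identical (either both with or both without the `try/` prefix) would be a cheap thing to try before digging further.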

Here is an example of the Slurm output from the main Snakemake cluster submission:

Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 1000
Job stats:
job          count    min threads    max threads
---------  -------  -------------  -------------
GVCFSplit        5              1              1
all              1              1              1
total            6              1              1

Select jobs to execute...

[Thu Mar 14 14:47:00 2024]
rule GVCFSplit:
    input: gvcf/12_19264_20
    output: splits/sample_check/12_19264_20_done.txt
    log: logs/GVCFSplit/12_19264_20_done.log
    jobid: 5
    benchmark: benchmarks/GVCFSplit/12_19264_20_done.benchmark.txt
    reason: Missing output files: splits/sample_check/12_19264_20_done.txt
    wildcards: SAMPLES=12_19264_20
    resources: mem_mb=1000, disk_mb=1000, tmpdir=/tmp, partition=main, mem=1g, time=4:00:00, threads=1


            python3 /path/to/script/GVCF_split.py 12_19264_20

When I run this Python script manually from the terminal, it finishes without any errors.

