How do I change part of a filename in Python when the filename is held in a variable?

gL=genomeList.txt  # Text file containing a list of genomes to loop through.
for i in $(cat ${gL}); do
    #some other stuff
    python ./find_all_ORF_from_getorf.py ${i}_getorf.fsa_aa
done

import re, sys
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

infile = sys.argv[1]

with open(f'{infile}_all_ORF.fsa_aa'.format(), "a") as file_object:
    for sequence in SeqIO.parse(infile, "fasta"):
        #do some stuff
        print(f'{sequence.description}_ORF_from_position_{h.start()},\n{sequence.seq[h_start:]}', file=file_object)

1 answer

User

#1 · Posted 2024-04-20 09:26:48

Regarding the bash code, you may find the snippet below useful. I find it more readable and use it often when iterating over lines:

# IFS= and -r keep each line intact (no whitespace trimming, no backslash escapes)
while IFS= read -r line; do
    #some other stuff
    python ./find_all_ORF_from_getorf.py "${line}_getorf.fsa_aa"
done < genomeList.txt

Now, regarding your question and the Python code:

import re, sys 

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

infile = sys.argv[1]

At this point, your infile will look like "GenomeFile_getorf.fsa_aa". One option is to split this string on "." and take the first item:

name = infile.split('.')[0]

If you know the filename may contain several "." characters, as in "Myfile.out.old", and you only want to strip the last extension:

name = infile.rsplit('.',1)[0]

A third option: if you know that all your files end with ".fsa_aa", you can slice the string with a negative index. Since ".fsa_aa" has 7 characters:

name = infile[:-7]

All three options rely on Python's built-in string methods, which are described in the official Python docs.
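
As a quick check, the three options can be compared side by side. This is only a sketch: the filenames are sample values, and the final `endswith` guard is an addition of mine (not part of the options above) that avoids chopping 7 characters off a name that does not actually end in ".fsa_aa".

```python
# Sample filenames, chosen only for illustration.
infile = "GenomeFile_getorf.fsa_aa"
multi_dot = "Myfile.out.old"

# Option 1: split on every "." and keep the first piece.
by_split = infile.split(".")[0]

# Option 2: split only on the last ".", so earlier dots survive.
by_rsplit = multi_dot.rsplit(".", 1)[0]

# Option 3: slice off the 7 characters of ".fsa_aa".
by_slice = infile[:-7]

# Guarded variant (my addition): slice only if the suffix is really there.
suffix = ".fsa_aa"
safe = infile[:-len(suffix)] if infile.endswith(suffix) else infile

print(by_split)   # GenomeFile_getorf
print(by_rsplit)  # Myfile.out
print(by_slice)   # GenomeFile_getorf
print(safe)       # GenomeFile_getorf
```

Note that option 1 would turn "Myfile.out.old" into "Myfile", which is why option 2 exists.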

outfile = f'{name}_all_ORF.fsa_aa' 
# if you wrote f'{variable}' you don't need the ".format()"
# On the other hand you can do '{}'.format(variable)
# or even '{variable}'.format(variable=SomeOtherVariable)

with open(outfile, "a") as file_object:
    for sequence in SeqIO.parse(infile, "fasta"):
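        #do some stuff (h and h_start come from this elided step)
        # write() does not append a newline the way print() does, so add one explicitly
        file_object.write(f'{sequence.description}_ORF_from_position_{h.start()},\n{sequence.seq[h_start:]}\n')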
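
The remarks about `.format()` in the comments can be verified with a small sketch. The value of `name` is a placeholder; the point is only that the three spellings build the same string, and that the stray `.format()` from the question is harmless here but redundant.

```python
name = "GenomeFile_getorf"  # placeholder value for illustration

# f-string: the variable is interpolated directly.
a = f'{name}_all_ORF.fsa_aa'

# str.format with a positional argument.
b = '{}_all_ORF.fsa_aa'.format(name)

# str.format with a keyword argument.
c = '{variable}_all_ORF.fsa_aa'.format(variable=name)

# The question's form: the f-string is already fully built, so the
# extra .format() call finds no braces left to fill and changes nothing.
d = f'{name}_all_ORF.fsa_aa'.format()

print(a == b == c == d)  # True
```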

Another option is to use Path from the pathlib library, which I recommend. In that case you need to make a few other small changes to your code:

import re, sys
from pathlib import Path # <- Here

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

infile = Path(sys.argv[1]) # <- Here
outfile = infile.stem + '_all_ORF.fsa_aa' # <- Here 
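# infile.stem drops the directory, so the line above writes to the current directory.
# If you want outfile as a Path next to infile, I would suggest instead
# outfile = infile.parent.joinpath(infile.stem + '_all_ORF.fsa_aa')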

with open(outfile, "a") as file_object:
    for sequence in SeqIO.parse(infile, "fasta"):
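        #do some stuff (h and h_start come from this elided step)
        # write() does not append a newline the way print() does, so add one explicitly
        file_object.write(f'{sequence.description}_ORF_from_position_{h.start()},\n{sequence.seq[h_start:]}\n')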
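
To see what the `Path` attributes return, here is a minimal sketch. The path "data/GenomeFile_getorf.fsa_aa" is a hypothetical example, and `with_name` is shown as one possible way to keep the output in the same directory as the input.

```python
from pathlib import Path

# Hypothetical input path, for illustration only.
infile = Path("data/GenomeFile_getorf.fsa_aa")

print(infile.stem)    # GenomeFile_getorf  (final component, last suffix removed)
print(infile.suffix)  # .fsa_aa
print(infile.name)    # GenomeFile_getorf.fsa_aa

# stem alone is just a string without the directory part.
outfile_cwd = infile.stem + "_all_ORF.fsa_aa"

# with_name swaps the final component and keeps the parent directory.
outfile_same_dir = infile.with_name(infile.stem + "_all_ORF.fsa_aa")

# Only the last suffix is dropped when a name has several dots.
print(Path("Myfile.out.old").stem)  # Myfile.out
```

Like `rsplit('.', 1)`, `stem` removes only the last extension, so it behaves sensibly for names with several dots.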

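Finally, as you can see, in both cases I replaced the print call with the file_object.write method, which I find clearer when the target is a file. Keep in mind that write, unlike print, does not append a newline, so each record needs its own trailing "\n".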
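
One practical difference between the two calls is the trailing newline. The sketch below uses `io.StringIO` as a stand-in for a real file, and the record text is made up for illustration.

```python
import io

# Made-up record text standing in for one formatted ORF entry.
record = "seq1_ORF_from_position_3,\nMKV"

# print(..., file=...) appends a newline after the text.
printed = io.StringIO()
print(record, file=printed)

# write(...) emits exactly the string it is given.
written = io.StringIO()
written.write(record)

print(repr(printed.getvalue()))  # 'seq1_ORF_from_position_3,\nMKV\n'
print(repr(written.getvalue()))  # 'seq1_ORF_from_position_3,\nMKV'
```

Without an explicit "\n" at the end of each `write`, consecutive records would run together on one line in the output file.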
