File is rewritten incorrectly

chrom   start       stop        strand  isoform     mu_codon    mut_codon2  more_info
chr22   43089055    43089055    -       NM_017436   C                       903delC
chr22   43089715    43089717    -       NM_017436   CTT                     241_243delTTC
chr22   43089657    43089657    -       NM_017436   G                       301delG
chr12   53701873    53701875    -       NM_015665   TTC         A           1292_1294delTTCinsA

import csv

OutputFileName = "indels_mut_count2.txt"
OutputFile = open(OutputFileName, 'w')
with open("indels_mut_removed.txt") as f:
    for line in f:
        columns = line.split('\t')
        chrom = columns[0]
        start = columns[1]
        stop = columns[2]
        strand = columns[3]
        isoform = columns[4]
        codon1 = columns[5]
        codon2 = columns[6]
        info = columns[7]
        length = len(codon1)
        length2 = len(codon2)
        OutputFile.write(''+chrom+'\t'+str(start)+'\t'+str(stop)+'\t'+strand+'\t'+isoform+'\t'+codon1+'\t'+codon2+'\t'+str(length)+'\t'+str(length2)+'\t'+info+'\n')

chrom   start       stop        strand  isoform     mu_codon    mut_codon2  8   10  more_info
chr22   43089055    43089055    -       NM_017436   C                       1   1   903delC
chr22   43089715    43089717    -       NM_017436   CTT                     3   1   241_243delTTC
chr22   43089657    43089657    -       NM_017436   G                       1   1   301delG

3 Answers

User

#1 · edited 2024-06-09 19:19:06

csv (as Arkanosis suggested) is a good option; otherwise:

with open("indels_mut_removed.txt") as f:
    for line in f:
        line = line.rstrip('\n')  # removes only the trailing '\n', so an empty last column is kept
        columns = line.split('\t')
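
For completeness, here is a minimal sketch of the whole loop with that fix applied. It is not the asker's exact script: the helper name `add_length_columns` is made up for illustration, the demo uses in-memory files instead of the real ones, and the sample row assumes that `mut_codon2` is present as an empty field.

```python
import io

def add_length_columns(fin, fout):
    """Copy tab-separated rows, inserting len(codon1) and len(codon2)
    before the last column. (Hypothetical helper, for illustration only.)"""
    for line in fin:
        # Drop only the line terminator; a bare strip() would also eat a
        # trailing tab and lose an empty last column.
        columns = line.rstrip('\r\n').split('\t')
        codon1, codon2, info = columns[5], columns[6], columns[7]
        out = columns[:7] + [str(len(codon1)), str(len(codon2)), info]
        fout.write('\t'.join(out) + '\n')

# Demo on in-memory files; the row has an empty mut_codon2 field (assumed).
src = io.StringIO("chr22\t43089055\t43089055\t-\tNM_017436\tC\t\t903delC\n")
dst = io.StringIO()
add_length_columns(src, dst)
result = dst.getvalue()
```

With real files, the same function could be called as `add_length_columns(f, OutputFile)` inside the two `open()` blocks.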

User

#2 · edited 2024-06-09 19:19:06

To use csv, you can do something like the following:

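import csv

# fn and fo are the input and output file paths
with open(fn, newline='') as fin, open(fo, 'w', newline='') as fout:
    reader = csv.reader(fin, delimiter='\t')
    writer = csv.writer(fout, delimiter='\t')
    headers_in = next(reader)
    # Insert the two length columns just before the last column
    headers_out = headers_in[:-1] + ['len codon 1', 'len codon 2'] + headers_in[-1:]
    writer.writerow(headers_out)
    for row_in in reader:
        # Build a list explicitly: in Python 3, map() returns an iterator,
        # which cannot be concatenated with a list
        row_out = row_in[:-1] + [len(row_in[5]), len(row_in[6])] + row_in[-1:]
        writer.writerow(row_out)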

Given this input:

chrom   start   stop    strand  isoform mu_codon    mut_codon2  more_info
chr22   43089055    43089055    -   NM_017436   C   903delC
chr22   43089715    43089717    -   NM_017436   CTT 241_243delTTC
chr22   43089657    43089657    -   NM_017436   G   301delG
chr12   53701873    53701875    -   NM_015665   TTC A   1292_1294delTTCinsA

it produces this output:

chrom   start   stop    strand  isoform mu_codon    mut_codon2  len codon 1 len codon 2 more_info
chr22   43089055    43089055    -   NM_017436   C   1   7   903delC
chr22   43089715    43089717    -   NM_017436   CTT 3   13  241_243delTTC
chr22   43089657    43089657    -   NM_017436   G   1   7   301delG
chr12   53701873    53701875    -   NM_015665   TTC A   5   19  1292_1294delTTCinsA

User

#3 · edited 2024-06-09 19:19:06

I'd suggest using csv.reader rather than splitting the lines by hand. A plain split doesn't handle escaping and a few other things, so csv.reader is safer.
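
To illustrate the escaping point, here is a small sketch comparing a manual split with `csv.reader` on a made-up line whose quoted field contains a tab. The sample line is hypothetical and does not come from the asker's file.

```python
import csv
import io

# Hypothetical line: the second field is quoted and contains a tab.
line = 'chr22\t"note with\ta tab"\t903delC\n'

# Manual split cuts the quoted field in two and keeps the quote characters.
naive = line.rstrip('\n').split('\t')

# csv.reader treats the quoted text as a single field and drops the quotes.
parsed = next(csv.reader(io.StringIO(line), delimiter='\t'))
```

Here `naive` ends up with four fields while `parsed` has the intended three.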

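For reference, you ran into this because you don't strip the \n at the end of each line, for example with rstrip(). That \n stays attached to the last column and is written to the output, and the (second) \n you add in the write() call then ends up on a line of its own.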
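
A quick way to see this, sketched on a single sample row (assuming the `mut_codon2` field is present but empty):

```python
# One input line as the file iterator would yield it, terminator included.
line = "chr22\t43089055\t43089055\t-\tNM_017436\tC\t\t903delC\n"

info = line.split('\t')[7]       # last column still carries the '\n'
written = info + '\n'            # what the write() call appends to the file
fixed = line.rstrip('\n').split('\t')[7]  # last column after stripping
```

`written` contains two newlines, which is where the extra blank line comes from; `fixed` contains none.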

Still, use csv.reader instead.
