I have a file.txt
that looks like this (I removed a few lines to simplify my example):
PLXNA3 ### <- filename1
Missense/nonsense : 13 mutations # <- header spaces
accession codon_change amino_acid_change # <- column names tsv
ID73 CAT-TAT His66Tyr # <- line tsv
ID63 GAC-AAC Asp127Asn # <- line tsv
ID31 GCC-GTC Ala307Val # <- line tsv
NEDD4L ### <- filename2
Splicing : 1 mutation # <- header spaces
accession splicing_mutation # <- column names tsv
ID51 IVS1 as G-A -16229 # <- line tsv
Gross deletions : 1 mutation # <- header spaces
accession DNA_level description HGVS_(nucleotide) HGVS_(protein) # <- column names tsv
ID853 gDNA 4.5 Mb incl. entire gene Not yet available Not yet available # <- line tsv
OPHN1 ### <- filename3
Small insertions : 3 mutations # <- header spaces
accession insertion HGVS_(nucleotide) # <- column names tsv
ID96 TTATGTT(^183)TATtCAAATCCAGG c.549dupT p.(Gln184Serfs*23) # <- line tsv
ID25 GTGCT(^310)AAGCAcaG_EI_GTCAGTTCT c.931_932dupCA # <- line tsv
I want to split this file into 3 separate files:
PLXNA3.txt
PLXNA3 ### <- filename1
Missense/nonsense : 13 mutations # <- header spaces
accession codon_change amino_acid_change # <- column names tsv
ID73 CAT-TAT His66Tyr # <- line tsv
ID63 GAC-AAC Asp127Asn # <- line tsv
ID31 GCC-GTC Ala307Val # <- line tsv
NEDD4L.txt
NEDD4L ### <- filename2
Splicing : 1 mutation # <- header spaces
accession splicing_mutation # <- column names tsv
ID51 IVS1 as G-A -16229 # <- line tsv
Gross deletions : 1 mutation # <- header spaces
accession DNA_level description HGVS_(nucleotide) HGVS_(protein) # <- column names tsv
ID853 gDNA 4.5 Mb incl. entire gene Not yet available Not yet available # <- line tsv
OPHN1.txt
OPHN1 ### <- filename3
Small insertions : 3 mutations # <- header spaces
accession insertion HGVS_(nucleotide) # <- column names tsv
ID96 TTATGTT(^183)TATtCAAATCCAGG c.549dupT p.(Gln184Serfs*23) # <- line tsv
ID25 GTGCT(^310)AAGCAcaG_EI_GTCAGTTCT c.931_932dupCA # <- line tsv
How can I produce the desired output on Linux with a tool such as awk or python?
Thanks in advance.
An equivalent but terser option is
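Here is the solution I came up with. It first opens the file to be split, then reads the first line, which is the name of the first output file. Ignore the while loop for now. The script opens a new file named after the line it just read (strip() is needed to remove the trailing newline character). It then reads lines and writes them to the new file until it reaches a line that contains no space or tab; that line is the name of the next file. The process repeats until there are no more lines to read (this is the while loop I skipped earlier).
Hope that helps :)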
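The approach described above can be sketched roughly as follows. This is a minimal illustration, not the answerer's original script: the function name `split_by_name_lines` and the `out_dir` parameter are hypothetical. It assumes that a non-empty line with no space or tab is a block name, that the `### <- ...` annotations in the question are not part of the real file, and that the name line should also be written into its own output file, as in the desired output shown in the question.

```python
from pathlib import Path


def split_by_name_lines(source, out_dir="."):
    """Split `source` into one .txt file per block; return the paths written."""
    out_dir = Path(out_dir)
    written = []
    out = None  # currently open output file, if any
    try:
        with open(source, encoding="utf-8") as src:
            for line in src:
                text = line.strip()  # drop the trailing newline
                # Assumption: a non-empty line with no space or tab
                # is the name of a new block.
                if text and " " not in text and "\t" not in text:
                    if out is not None:
                        out.close()
                    path = out_dir / f"{text}.txt"
                    out = open(path, "w", encoding="utf-8")
                    written.append(path)
                # Copy every line (including the name line) to the
                # current block; lines before the first name are skipped.
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling `split_by_name_lines("file.txt")` on the sample from the question should create PLXNA3.txt, NEDD4L.txt and OPHN1.txt in the current directory. A single loop with a name check replaces the nested read loops of the description; if a data line can ever consist of a single token with no whitespace, the name test would need to be stricter.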