I am trying to use Python to extract some information from a file. Part of the file looks like this:
1. [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array
(Submitter supplied) Affymetrix submissions are replicated on the GeneChip Human Genome U133 Plus 2.0 Array. more...
Organism: Homo sapiens
527 DataSets 4123 Series 54 Related Platforms 115874 Samples
FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPLnnn/GPL570/
Platform Accession: GPL570 ID: 100000570
2. [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array
(Submitter supplied) Affymetrix submissions are typically Array. more...
Organism: Mus musculus
517 DataSets 3529 Series 36 Related Platforms 46528 Samples
FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL1nnn/GPL1261/
Platform Accession: GPL1261 ID: 100001261
This is my code so far:
import re

# Patterns for the lines of interest in each platform record.
pattern = re.compile(r'^\d+[.]\s')        # numbered title line, e.g. "1. [HG-U133_Plus_2] ..."
pattern2 = re.compile(r'Organism:')
pattern3 = re.compile(r'FTP download:')
pattern4 = re.compile(r'ID: ')
patterns = (pattern, pattern2, pattern3, pattern4)

listOrg = []
with open('Microarray/PlatformsMicroarray.txt') as f:
    # Iterate over the file once; nesting a second loop over f inside
    # itertools.groupby skipped the first line and was not needed.
    for line in f:
        if any(p.search(line) for p in patterns):
            listOrg.append(line)

# Write the matched lines, unchanged, to a text file.
with open("results.txt", "w") as out:
    out.writelines(listOrg)
How can I combine this information so that I can write it to a .csv file?
The csv module is your weapon of choice. In this case, listOrg appears to hold the parsed input, so you can write it out with that module.
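One way this could look, as a minimal sketch rather than the answerer's exact code: group the lines in listOrg into one record per platform, then write the records with csv.DictWriter. The helper names (group_records, write_csv), the column names in FIELDS, and the output file name results.csv are assumptions made for illustration; the sample listOrg simply repeats lines from the question.

```python
import csv
import re

# Hypothetical column names for the output file.
FIELDS = ["Title", "Organism", "FTP", "Accession", "ID"]

TITLE_RE = re.compile(r'^\d+[.]\s+(.*)')
ACCESSION_RE = re.compile(r'Platform Accession:\s*(\S+)\s+ID:\s*(\S+)')


def group_records(lines):
    """Group matched lines into one dict per platform.

    A numbered title line ("1. ...") starts a new record; the lines
    that follow fill in that record's remaining fields.
    """
    records = []
    for raw in lines:
        line = raw.strip()
        title = TITLE_RE.match(line)
        if title:
            records.append({"Title": title.group(1)})
        elif not records:
            continue  # ignore anything before the first title line
        elif line.startswith("Organism:"):
            records[-1]["Organism"] = line.split(":", 1)[1].strip()
        elif line.startswith("FTP download:"):
            # The URL is the last whitespace-separated token.
            records[-1]["FTP"] = line.split()[-1]
        else:
            match = ACCESSION_RE.match(line)
            if match:
                records[-1]["Accession"], records[-1]["ID"] = match.groups()
    return records


def write_csv(records, out):
    """Write the records to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(records)  # fields missing from a record are left empty


# Sample input: the lines the question's code would collect.
listOrg = [
    "1. [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array\n",
    "Organism: Homo sapiens\n",
    "FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPLnnn/GPL570/\n",
    "Platform Accession: GPL570 ID: 100000570\n",
    "2. [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array\n",
    "Organism: Mus musculus\n",
    "FTP download: GEO ftp://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL1nnn/GPL1261/\n",
    "Platform Accession: GPL1261 ID: 100001261\n",
]

if __name__ == "__main__":
    # newline="" lets the csv module control line endings itself.
    with open("results.csv", "w", newline="") as out:
        write_csv(group_records(listOrg), out)
```

Using DictWriter instead of a plain csv.writer keeps the columns aligned even when a record lacks one of the fields, since the missing cell is simply left empty.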