Filling a CSV table with Python: scalability

Posted 2024-04-23 06:06:36


I'm a complete beginner, so tell me if I've done something stupid! I come from the medical field and have never learned programming.

I have two CSV tables. The first (Table.csv) lists the characteristics of each case, with 1 meaning present and 0 meaning absent:

Characteristics\Patients, Jan, Piet, John, Frederic, Taisha
Jaundice, 1, 1, 0, 0, 0
Fever, 1, 0, 0, 1, 0
Tachycardia, 1, 0, 0, 1, 1
Familyhistory, 0, 0, 1, 1, 1

I made another file, AccumTable.csv:

Characteristics\Characteristics, Jaundice, Fever, Tachycardia, Familyhistory
Jaundice, 0, 0, 0, 0
Fever, 0, 0, 0, 0
Tachycardia, 0, 0, 0, 0
Familyhistory, 0, 0, 0, 0

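What I want to do is fill AccumTable.csv with the number of times two symptoms occur together in the same patient, to see whether some things tend to happen together.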
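To make the intended result concrete, here is a minimal sketch that counts the co-occurrences for the five sample patients above, entirely in memory. It is written for Python 3, and the names `table` and `count_cooccurrences` are made up for this illustration. It assumes each pair is counted once, in table order, so only the upper triangle of the result is filled.

```python
from itertools import combinations

# Sample data copied from the first table: symptom -> flag per patient.
patients = ["Jan", "Piet", "John", "Frederic", "Taisha"]
table = {
    "Jaundice":      [1, 1, 0, 0, 0],
    "Fever":         [1, 0, 0, 1, 0],
    "Tachycardia":   [1, 0, 0, 1, 1],
    "Familyhistory": [0, 0, 1, 1, 1],
}


def count_cooccurrences(table, n_patients):
    """Count, for each pair of symptoms, how many patients have both."""
    symptoms = list(table)
    counts = {a: {b: 0 for b in symptoms} for a in symptoms}
    for p in range(n_patients):
        # Symptoms present in patient p, kept in table order.
        present = [s for s in symptoms if table[s][p] == 1]
        # Every unordered pair of this patient's symptoms adds 1 to one cell.
        for a, b in combinations(present, 2):
            counts[a][b] += 1
    return counts


counts = count_cooccurrences(table, len(patients))
for symptom, row in counts.items():
    print(symptom, list(row.values()))
```

For example, Fever and Tachycardia occur together in Jan and Frederic, so that cell ends up as 2.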

So I wrote this script:

import csv
import shutil

# Python 2 script; both files are read as tab-delimited.
with open('TableBLCA.csv', "rb") as tablein:
    tablereader = csv.DictReader(tablein, delimiter='\t', lineterminator='\n')
    counter = 1
    for row in tablereader:
        counter = counter + 1
        symp = row["Characteristics\\Patients"]
        for pat, presence in row.iteritems():
            if presence == '1':
                # Scan the remaining rows for other symptoms of the same patient.
                for row in tablereader:
                    if row[pat] == '1':
                        assocsymp = row["Characteristics\\Patients"]
                        print symp
                        print assocsymp
                        # Rewrite the whole accumulator file to add 1 to a single cell.
                        with open('AccumTable.csv', "rb") as accumtablein, open('AccumTableintermediate.csv', "wb") as accumtableout:
                            accumtablereader = csv.DictReader(accumtablein, delimiter='\t', lineterminator='\n')
                            fieldname = accumtablereader.fieldnames
                            accumtablewriter = csv.DictWriter(accumtableout, delimiter='\t', lineterminator='\n', fieldnames=fieldname)
                            accumtablewriter.writeheader()
                            for row in accumtablereader:
                                if row["Characteristics\\Characteristics"] == symp:
                                    updatedrow = row
                                    number = updatedrow[assocsymp]
                                    number = int(number) + 1
                                    updatedrow.update({assocsymp: number})
                                    accumtablewriter.writerow(updatedrow)
                                else:
                                    accumtablewriter.writerow(row)
                        # Both files are closed by the with block; replace the original.
                        shutil.copyfile('AccumTableintermediate.csv', 'AccumTable.csv')
                # Rewind, then skip the header line and the rows already processed.
                tablein.seek(0)
                readercount = 0
                while readercount < counter:
                    next(tablereader)
                    readercount = readercount + 1

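This works fine. The problem is scalability: for every single increment I have to write the whole file, close it, copy it over the original, and then read the newly written file again so that the counts can accumulate. Is there a faster way? My first table can have more than 50,000 rows and 9,000 columns, and my second file is a 50,000 x 50,000 CSV.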
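One possible direction, sketched here under stated assumptions and not as a definitive answer: read the first table once, keep all counts in memory, and write AccumTable.csv a single time at the end. The sketch is Python 3, and the helper names `read_presence`, `accumulate` and `write_accum` are hypothetical. The demo uses a comma-delimited in-memory sample in place of the real files, so the delimiter would need to be `'\t'` for tab-delimited input.

```python
import csv
import io
from itertools import combinations


def read_presence(fileobj, delimiter=","):
    """Read the table once; return symptom names and, per symptom,
    the set of patient column indices flagged with 1."""
    reader = csv.reader(fileobj, delimiter=delimiter, skipinitialspace=True)
    next(reader)  # skip the header row holding the patient names
    symptoms, carriers = [], []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        symptoms.append(row[0])
        carriers.append({i for i, flag in enumerate(row[1:]) if flag.strip() == "1"})
    return symptoms, carriers


def accumulate(carriers):
    """Fill the upper triangle with the number of shared patients per pair."""
    n = len(carriers)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = len(carriers[i] & carriers[j])
    return matrix


def write_accum(fileobj, symptoms, matrix, delimiter=","):
    """Write the finished table in one pass."""
    writer = csv.writer(fileobj, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["Characteristics\\Characteristics"] + symptoms)
    for name, row in zip(symptoms, matrix):
        writer.writerow([name] + row)


# Demo on the sample table from the question, held in memory.
sample = """Characteristics\\Patients, Jan, Piet, John, Frederic, Taisha
Jaundice, 1, 1, 0, 0, 0
Fever, 1, 0, 0, 1, 0
Tachycardia, 1, 0, 0, 1, 1
Familyhistory, 0, 0, 1, 1, 1
"""
symptoms, carriers = read_presence(io.StringIO(sample))
matrix = accumulate(carriers)
out = io.StringIO()
write_accum(out, symptoms, matrix)
print(out.getvalue())
```

This removes the repeated file rewriting, but it is only a sketch: a 50,000 x 50,000 table has 2.5 billion cells, so nested Python lists and a pure-Python loop over every pair would probably be too slow and too large at that size, and a numeric array library would likely be needed.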

Thank you very much for your ideas!

Michael

