Answers to the question: Reading and filtering files in parallel in Python


1 answer

Anonymous, 1 day ago

<p>The problem is that you iterate 672343 * 795516 = 534,859,613,988 times, which is far too many. You need a smarter solution.</p>
<p>So the real issue is that we look at far too much data, and that has to change. One way is to be clever about it: for example, build a dictionary keyed on <code>chr</code>, so that we only have to check the entries with a matching key. That still leaves <code>start</code> and <code>end</code> unhandled, though. Perhaps there is a clever way for those as well.</p>
<p>This looks a lot like a database. So if it is a database, maybe we should treat it as one. Python ships with sqlite3.</p>
<p>Here is one solution, but there are countless other possibilities.</p>
<pre><code>import sqlite3
import csv

# create an in-memory database
conn = sqlite3.connect(":memory:")

# create the tables
c = conn.cursor()
c.execute("""CREATE TABLE t1 (
    chr TEXT,
    type TEXT,
    name TEXT,
    start INTEGER,
    end INTEGER
);""")
# if you only have a few columns, just name them all,
# if you have a lot, maybe just put everything in one
# column as a string
c.execute("""CREATE TABLE t2 (
    chr TEXT,
    num INTEGER,
    col3,
    col4
);""")

# create indices on the columns we use for selecting
c.execute("""CREATE INDEX i1 ON t1 (chr, start, end);""")
c.execute("""CREATE INDEX i2 ON t2 (chr, num);""")

# fill the tables
# (on Python 3 the csv module expects text mode with newline="")
with open("comparison_file.csv", newline="") as f:
    reader = csv.reader(f)
    # sqlite takes care of converting the number-strings to numbers
    c.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?)", reader)

with open("input.csv", newline="") as f:
    reader = csv.reader(f)
    # sqlite takes care of converting the number-strings to numbers
    c.executemany("INSERT INTO t2 VALUES (?, ?, ?, ?)", reader)

# now let sqlite do its magic and select the correct lines
c.execute("""SELECT t2.*, t1.* FROM t1
             JOIN t2 ON t1.chr == t2.chr
             WHERE t2.num BETWEEN t1.start AND t1.end;""")

# write result to disk
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row in c:
        writer.writerow(row)
</code></pre>
<h2>Python coding tips</h2>
<p>Here is how I would write your original code.</p>
^{pr2}$
<h3>Note 1</h3>
<pre><code>line = line[0:len(line) - 1]
</code></pre>
<p>can be written as</p>
<pre><code>line = line[:-1]
</code></pre>
<h3>Note 2</h3>
<p>Instead of</p>
<pre><code>my_list = [1,2,3]
for i in range(len(my_list)):
    # do something with my_list[i]
</code></pre>
<p>you should write:</p>
<pre><code>my_list = [1,2,3]
for item in my_list:
    # do something with item
</code></pre>
<p>If you also need the index, combine the loop with <code>enumerate()</code>.</p>
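<p>A minimal sketch of the <code>enumerate()</code> variant, using a made-up list purely for illustration:</p>

```python
my_list = ["a", "b", "c"]

# enumerate() yields (index, item) pairs, so there is no need
# to loop over range(len(my_list)) and index into the list
pairs = []
for i, item in enumerate(my_list):
    pairs.append((i, item))

print(pairs)
```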

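<p>The dictionary idea mentioned earlier in this answer could look roughly like the sketch below. It is not the sqlite3 solution, only an illustration of the alternative: intervals are grouped by <code>chr</code> and sorted by <code>start</code>, so each lookup only inspects intervals on the same <code>chr</code> that begin at or before the number. The helper names <code>build_index</code> and <code>find_matches</code> and the sample data are hypothetical and do not come from the original question.</p>

```python
from bisect import bisect_right
from collections import defaultdict


def build_index(intervals):
    """Group (chr, start, end) tuples by chr, sorted by start (hypothetical helper)."""
    by_chr = defaultdict(list)
    for chrom, start, end in intervals:
        by_chr[chrom].append((start, end))
    index = {}
    for chrom, spans in by_chr.items():
        spans.sort()
        starts = [s for s, _ in spans]
        index[chrom] = (starts, spans)
    return index


def find_matches(index, chrom, num):
    """Return all (start, end) spans on chrom with start <= num <= end."""
    if chrom not in index:
        return []
    starts, spans = index[chrom]
    # only spans that start at or before num can contain it
    hi = bisect_right(starts, num)
    return [(s, e) for s, e in spans[:hi] if e >= num]


# made-up sample data standing in for the two CSV files
intervals = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 10, 20)]
index = build_index(intervals)
matches = find_matches(index, "chr1", 180)
print(matches)
```

<p>With many long, overlapping intervals the filter on <code>end</code> can still scan a large part of a bucket, so this is a starting point rather than a guaranteed speed-up. The sqlite3 approach leaves that work to the database index.</p>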