Answers to the question: Rewinding the file pointer to the beginning of the previous line

Rewinding the file pointer to the beginning of the previous line


I am doing text processing and using the <code>readline()</code> function, as follows:
<pre><code>ifd = open(...)
for line in ifd:
    while (condition):
        do something...
        line = ifd.readline()
        condition = ....
</code></pre>
When the condition becomes false, I need to rewind the pointer so that the <code>for</code> loop reads the same line again. Calling <code>ifd.fseek()</code> followed by <code>readline()</code> gives me a '\n' character. How can I rewind the pointer so that the whole line is read again?
<h3>Here is my code</h3>
<pre><code>labtestnames = sorted(tmp)

#Now read each line in the inFile and write into outFile
ifd = open(inFile, "r")
ofd = open(outFile, "w")
#read the header
header = ifd.readline() #Do nothing with this line. Skip
#Write header into the output file
nl = "mrn\tspecimen_id\tlab_number\tlogin_dt\tfluid"
offset = len(nl.split("\t"))
nl = nl + "\t" + "\t".join(labtestnames)
ofd.write(nl+"\n")
lenFields = len(nl.split("\t"))

print "Reading the input file and converting into modified file for further processing (correlation analysis etc..)"

prevTup = (0,0,0)
rowComplete = 0
k=0
for line in ifd:
    k=k+1
    if (k==200): break

    items = line.rstrip("\n").split("\t")
    if((items[0] =='')):
        continue
    newline= list('' for i in range(lenFields))
    newline[0],newline[1],newline[3],newline[2],newline[4] = items[0], items[1], items[3], items[2], items[4]
    ltests = []
    ltvals = []
    while(cmp(prevTup, (items[0], items[1], items[3])) == 0): # If the same mrn, lab_number and specimen_id then fill the same row. else create a new row.
        ltests.append(items[6])
        ltvals.append(items[7])
        pos = ifd.tell()
        line = ifd.readline()
        prevTup = (items[0], items[1], items[3])
        items = line.rstrip("\n").split("\t")
        rowComplete = 1

    if (rowComplete == 1): #If the row is completed, prepare newline and write into outfile
        indices = [labtestnames.index(x) for x in ltests]
        j=0
        ifd.seek(pos)
        for i in indices:
            newline[i+offset] = ltvals[j]
            j=j+1

    if (rowComplete == 0): #
        currTup = (items[0], items[1], items[3])
        ltests = items[6]
        ltvals = items[7]
        pos = ifd.tell()
        line = ifd.readline()
        items = line.rstrip("\n").split("\t")
        newTup = (items[0], items[1], items[3])
        if(cmp(currTup, newTup) == 0):
            prevTup = currTup
            ifd.seek(pos)
            continue
        else:
            indices = labtestnames.index(ltests)
            newline[indices+offset] = ltvals

    ofd.write(newline+"\n")
</code></pre>
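A note on the symptom described in the question: `for line in ifd` reads ahead through an internal buffer, so mixing it with `ifd.readline()`, `ifd.tell()` and `ifd.seek()` leaves the reported position out of step with the line the loop will yield next. The following is only a minimal sketch, assuming Python 3 and an in-memory file standing in for the real input; `read_groups` and the sample data are hypothetical names introduced for illustration. It reads exclusively with `readline()`, so the position saved by `tell()` really is the start of the next line and `seek()` can rewind to it.

```python
import io


def read_groups(fh, key):
    """Collect runs of consecutive lines for which key(line) is equal."""
    groups = []
    while True:
        first = fh.readline()
        if not first:                  # empty string signals end of file
            break
        group = [first.rstrip("\n")]
        while True:
            pos = fh.tell()            # start of the line we are about to read
            line = fh.readline()
            if not line:
                break
            if key(line) != key(first):
                fh.seek(pos)           # rewind so the outer loop re-reads this line
                break
            group.append(line.rstrip("\n"))
        groups.append(group)
    return groups


# Hypothetical tab-separated sample: the first column is the grouping key.
sample = io.StringIO("a\t1\na\t2\nb\t3\n")
groups = read_groups(sample, key=lambda line: line.split("\t")[0])
print(groups)
```

The design choice here is to avoid the `for` loop over the file entirely; once every read goes through `readline()`, no hidden read-ahead buffer is involved.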


1 answer

Anonymous, 1 day ago


This is much simpler to handle with <a href="http://docs.python.org/2/library/itertools.html#itertools.groupby" rel="nofollow">itertools.groupby</a>. <code>groupby</code> clusters all consecutive rows that share the same mrn, specimen_id and lab_num. The code that does this is
<pre><code>for key, group in IT.groupby(reader, key = mykey):
</code></pre>
where <code>reader</code> iterates over the rows of the input file, and <code>mykey</code> is the function, defined in the full listing below, that returns the tuple <code>(row['mrn'], row['specimen_id'], row['lab_num'])</code>. Each row from <code>reader</code> is passed to <code>mykey</code>, and all rows with the same key are gathered into the same <code>group</code>.
<hr/>
While we are at it, we might as well use the <a href="http://docs.python.org/2/library/csv.html" rel="nofollow">csv module</a> to read each line into a dict (which I call <code>row</code>). This frees us from low-level string manipulation such as <code>line.rstrip("\n").split("\t")</code>, and instead of referring to columns by index number (e.g. <code>row[3]</code>) we can write code in higher-level terms such as <code>row['lab_num']</code>.
<hr/>
<pre><code>import itertools as IT
import csv

inFile = 'curious.dat'
outFile = 'curious.out'

def mykey(row):
    return (row['mrn'], row['specimen_id'], row['lab_num'])

fieldnames = 'mrn specimen_id date lab_num Bilirubin Lipase Calcium Magnesium Phosphate'.split()

with open(inFile, 'rb') as ifd:
    reader = csv.DictReader(ifd, delimiter = '\t')
    with open(outFile, 'wb') as ofd:
        writer = csv.DictWriter(
            ofd, fieldnames, delimiter = '\t', lineterminator = '\n', )
        writer.writeheader()
        for key, group in IT.groupby(reader, key = mykey):
            new = {}
            # The first row of the group supplies the identifying columns.
            row = next(group)
            for name in ('mrn', 'specimen_id', 'date', 'lab_num'):
                new[name] = row[name]
            new[row['labtest']] = row['result_val']
            # Every remaining row contributes one labtest column.
            for row in group:
                new[row['labtest']] = row['result_val']
            writer.writerow(new)
</code></pre>
yields
<pre><code>mrn specimen_id date lab_num Bilirubin Lipase Calcium Magnesium Phosphate
4419529 1614487 26.2675 5802791G 0.1
3319529 1614487 26.2675 5802791G 0.3 153 8.1 2.1 4
5713871 682571 56.0779 9732266E 4.1
</code></pre>
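The answer's listing targets Python 2 (binary file modes with `csv`). As a rough sketch of the same `groupby` idea under Python 3, the version below uses in-memory text streams; the sample rows, `ID_FIELDS`, `TESTS` and `pivot` are hypothetical names introduced only for illustration, and the column names `labtest` and `result_val` are taken from the answer's code. With real files in Python 3 you would open them in text mode with `newline=''`, as the `csv` documentation recommends.

```python
import csv
import io
import itertools

# Hypothetical long-format sample; real data would come from the input file.
SAMPLE = (
    "mrn\tspecimen_id\tdate\tlab_num\tlabtest\tresult_val\n"
    "1\tA\t26.2\tX1\tLipase\t153\n"
    "1\tA\t26.2\tX1\tCalcium\t8.1\n"
    "2\tB\t56.0\tY2\tPhosphate\t4.1\n"
)

ID_FIELDS = ("mrn", "specimen_id", "date", "lab_num")
TESTS = ("Calcium", "Lipase", "Phosphate")   # assumed set of lab test names


def mykey(row):
    """Rows with the same (mrn, specimen_id, lab_num) belong to one output row."""
    return (row["mrn"], row["specimen_id"], row["lab_num"])


def pivot(infile, outfile):
    """Write one wide row per run of consecutive rows sharing the same key."""
    reader = csv.DictReader(infile, delimiter="\t")
    writer = csv.DictWriter(
        outfile, ID_FIELDS + TESTS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for _, group in itertools.groupby(reader, key=mykey):
        new = {}
        for row in group:
            for name in ID_FIELDS:
                new[name] = row[name]
            new[row["labtest"]] = row["result_val"]
        writer.writerow(new)             # missing tests are written as empty cells


out = io.StringIO()
pivot(io.StringIO(SAMPLE), out)
result = out.getvalue()
print(result)
```

Note that `groupby` only merges consecutive rows, so this relies on the input already being ordered by the key, as it appears to be in the question.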
