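<p>So, this approach is very hacky. It works well if your lines are all roughly the same size, with a small standard deviation. The idea is to read a portion of the file into a buffer that is small enough to be memory efficient but large enough that reading and writing from both ends will not mess things up (since the lines are roughly the same size with little variance, we can cross our fingers and hope it works). We basically keep track of where we are in the file and jump back and forth. I use a <code>collections.deque</code> as the buffer because it has good <code>append</code> performance from both ends, and we can take advantage of the FIFO nature of a queue:</p>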
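<pre><code>from collections import deque

def efficient_dropfirst(f, dropfirst=1, buffersize=3):
    # f must be opened for both reading and writing (e.g. "r+").
    # tail_pos is advanced by the return value of f.write(), so this
    # relies on that count matching the units of f.tell()/f.seek()
    # (true in binary mode; not guaranteed for encoded text files).
    f.seek(0)
    buffer = deque()
    tail_pos = 0  # where the next kept line will be written
    # these next two loops assume the file has many thousands of
    # lines so we can safely drop and buffer the first few...
    for _ in range(dropfirst):
        f.readline()
    for _ in range(buffersize):
        buffer.append(f.readline())
    line = f.readline()
    while line:
        buffer.append(line)
        head_pos = f.tell()  # remember where reading left off
        f.seek(tail_pos)
        tail_pos += f.write(buffer.popleft())
        f.seek(head_pos)
        line = f.readline()
    f.seek(tail_pos)
    # finally, clear out the buffer:
    while buffer:
        f.write(buffer.popleft())
    # cut off the leftover bytes at the end of the file
    f.truncate()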
</code></pre>
<p>Now, let's try this out with a pretend file that behaves nicely:</p>
<pre><code>>>> s = """1. the quick
... 2. brown fox
... 3. jumped over
... 4. the lazy
... 5. black dog.
... 6. Old McDonald's
... 7. Had a farm
... 8. Eeyi Eeeyi Oh
... 9. And on this farm they had a
... 10. duck
... 11. eeeieeeiOH
... """
</code></pre>
<p>And finally:</p>
<pre><code>>>> import io
>>> with io.StringIO(s) as f: # we mock a file
... efficient_dropfirst(f)
... final = f.getvalue()
...
>>> print(final)
2. brown fox
3. jumped over
4. the lazy
5. black dog.
6. Old McDonald's
7. Had a farm
8. Eeyi Eeeyi Oh
9. And on this farm they had a
10. duck
11. eeeieeeiOH
</code></pre>
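<p>This should work out fine if <code>dropfirst</code> &lt; <code>buffersize</code> with a bit of "slack". Since you only want to drop the first line, just keep <code>dropfirst=1</code>, and you can make <code>buffersize=100</code> or so just to be safe. It will be far more memory efficient than reading "many thousands of lines" at once, and as long as no line is much bigger than the first few lines, you should be safe. But be warned, this is very rough around the edges.</p>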
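<p>For what it's worth, here is a rough sketch of how the same routine might be pointed at a real file on disk instead of a <code>StringIO</code>. The helper name <code>drop_first_line_on_disk</code> and the throwaway temp file are my own assumptions for illustration, and I open the file in binary mode (<code>"r+b"</code>) so that the byte counts returned by <code>write</code> line up with <code>tell</code> and <code>seek</code>. The function is repeated only so that the snippet runs on its own:</p>

```python
import os
import tempfile
from collections import deque


def efficient_dropfirst(f, dropfirst=1, buffersize=3):
    # Same routine as above, repeated so this sketch is self-contained.
    f.seek(0)
    buffer = deque()
    tail_pos = 0
    for _ in range(dropfirst):
        f.readline()
    for _ in range(buffersize):
        buffer.append(f.readline())
    line = f.readline()
    while line:
        buffer.append(line)
        head_pos = f.tell()
        f.seek(tail_pos)
        tail_pos += f.write(buffer.popleft())
        f.seek(head_pos)
        line = f.readline()
    f.seek(tail_pos)
    while buffer:
        f.write(buffer.popleft())
    f.truncate()


def drop_first_line_on_disk(path, buffersize=100):
    # Hypothetical helper (not part of the original answer).
    # Binary mode keeps write() counts and tell() offsets in bytes.
    with open(path, "r+b") as f:
        efficient_dropfirst(f, dropfirst=1, buffersize=buffersize)


# Build a throwaway file of 1000 short lines, then drop the first in place.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.writelines(b"line %d\n" % i for i in range(1000))

drop_first_line_on_disk(path)

with open(path, "rb") as f:
    remaining = f.read().splitlines()
os.remove(path)

print(len(remaining), remaining[0], remaining[-1])
```

<p>Only about <code>buffersize</code> lines are held in memory at any time, which is the whole point. I would still try it on a copy of the file first.</p>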