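<p>This may be quicker than yours. It makes no assumptions about line length. It backs through the file one block at a time until it has found the right number of '\n' characters.</p>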
<pre><code>def tail(f, lines=20):
    """Return the last `lines` lines of f as bytes; f must be opened in binary mode ('rb')."""
    total_lines_wanted = lines
    BLOCK_SIZE = 1024
    f.seek(0, 2)  # jump to the end of the file
    block_end_byte = f.tell()
    # look for one extra '\n' so the first returned line is complete
    lines_to_go = total_lines_wanted + 1
    block_number = -1
    blocks = []  # blocks of size BLOCK_SIZE, in reverse order starting
                 # from the end of the file
    while lines_to_go > 0 and block_end_byte > 0:
        if block_end_byte - BLOCK_SIZE > 0:
            # read the last block we haven't yet read
            f.seek(block_number * BLOCK_SIZE, 2)
            blocks.append(f.read(BLOCK_SIZE))
        else:
            # at most BLOCK_SIZE bytes remain, start from the beginning
            f.seek(0, 0)
            # only read what was not read
            blocks.append(f.read(block_end_byte))
        lines_found = blocks[-1].count(b'\n')
        lines_to_go -= lines_found
        block_end_byte -= BLOCK_SIZE
        block_number -= 1
    all_read_text = b''.join(reversed(blocks))
    return b'\n'.join(all_read_text.splitlines()[-total_lines_wanted:])
</code></pre>
<p>I don't like tricky assumptions about line length when, as a practical matter, you can never know things like that.</p>
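<p>Generally, this will locate the last 20 lines on the first or second pass through the loop. If your 74-character figure is actually accurate, make the block size 2048 and you'll tail 20 lines almost immediately.</p>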
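<p>To make the "first or second pass" claim concrete, here is a minimal sketch. The function name <code>tail_counting_passes</code>, the <code>block_size</code> parameter, and the test data are all made up for illustration; it is the same backwards block scan, written to also report how many blocks it read, and it assumes a seekable file opened in binary mode.</p>

```python
import io

def tail_counting_passes(f, lines=20, block_size=2048):
    """Hypothetical variant of tail() that also reports how many blocks were read.

    f must be a seekable binary file object.
    """
    f.seek(0, 2)                      # jump to the end of the file
    block_end_byte = f.tell()
    lines_to_go = lines + 1           # one extra newline so the first line is whole
    block_number = -1
    blocks = []                       # blocks in reverse order, newest first
    while lines_to_go > 0 and block_end_byte > 0:
        if block_end_byte - block_size > 0:
            # a full block is still available before the ones already read
            f.seek(block_number * block_size, 2)
            blocks.append(f.read(block_size))
        else:
            # less than a full block remains: read the unread head of the file
            f.seek(0, 0)
            blocks.append(f.read(block_end_byte))
        lines_to_go -= blocks[-1].count(b"\n")
        block_end_byte -= block_size
        block_number -= 1
    text = b"".join(reversed(blocks))
    return text.splitlines()[-lines:], len(blocks)

# Made-up data: 1000 lines of 74 characters plus a newline (75 bytes each).
data = b"".join(("%04d" % i).encode() + b"x" * 70 + b"\n" for i in range(1000))
last_lines, passes = tail_counting_passes(io.BytesIO(data), lines=20, block_size=2048)
print(len(last_lines), passes)
```

<p>With this made-up data, a 2048-byte block holds more than 20 complete lines, so a single pass is enough; dropping <code>block_size</code> to 1024 takes two passes.</p>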
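<p>Also, I don't burn a lot of brain calories trying to finesse alignment with physical OS blocks. Using these high-level I/O packages, I doubt you'll see any performance consequence of trying to align on OS block boundaries. If you use lower-level I/O, then you might see a speedup.</p>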
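<p>For anyone who wants to experiment with the lower-level route, the following is a rough sketch using raw file descriptors from the <code>os</code> module. The names <code>tail_fd</code> and <code>_read_exact</code> are hypothetical, and reading at offsets that are multiples of <code>block_size</code> is only an approximation of OS block alignment; whether it is actually faster is something you would have to measure.</p>

```python
import os
import tempfile

def _read_exact(fd, n):
    """Read up to n bytes from fd, looping because os.read may return fewer."""
    parts = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:          # end of file
            break
        parts.append(chunk)
        n -= len(chunk)
    return b"".join(parts)

def tail_fd(path, lines=20, block_size=4096):
    """Hypothetical tail() on a raw descriptor, reading at block_size-aligned offsets."""
    # O_BINARY only exists on Windows; fall back to 0 elsewhere
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        pos = os.lseek(fd, 0, os.SEEK_END)   # file size in bytes
        newlines_needed = lines + 1          # one extra so the first line is whole
        blocks = []
        while newlines_needed > 0 and pos > 0:
            # step back to the previous multiple of block_size
            start = (pos - 1) // block_size * block_size
            os.lseek(fd, start, os.SEEK_SET)
            chunk = _read_exact(fd, pos - start)
            blocks.append(chunk)
            newlines_needed -= chunk.count(b"\n")
            pos = start
    finally:
        os.close(fd)
    return b"".join(reversed(blocks)).splitlines()[-lines:]

# Usage with a throwaway file; the tiny block size just forces several reads.
tmp_fd, tmp_path = tempfile.mkstemp()
os.write(tmp_fd, b"".join(("line %d\n" % i).encode() for i in range(100)))
os.close(tmp_fd)
result = tail_fd(tmp_path, lines=3, block_size=64)
os.remove(tmp_path)
print(result)
```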