Is it possible to go back when reading a file with seek and calls to next()?

Asked 2025-04-18 00:09

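I am writing a Python script that reads a file. When I reach a certain section of the file, the way that section ultimately has to be read depends on information given inside the section itself. I found that I could use something like this: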

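fp = open('myfile')
last_pos = fp.tell()
line = fp.readline()
while line != '':
    # readline() keeps the trailing newline, so strip it before comparing
    if line.strip() == 'SPECIAL':
        # rewind to the start of the SPECIAL line and hand the file over
        fp.seek(last_pos)
        other_function(fp)
        break
    last_pos = fp.tell()
    line = fp.readline()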

But my current code is structured roughly like this:

import itertools

fh = open(filename)

# get generator function and attach None at the end to stop iteration
items = itertools.chain(((lino, line) for lino, line in enumerate(fh, start=1)), (None,))
item = True

while item:
    lino, line = next(items)

    # handle special section
    if line.startswith('SPECIAL'):

        # this is the call that fails: next() has already been used on fh
        start = fh.tell()

        for i in range(specialLines):
            lino, eline = next(items)
            # etc. get the special data I need here

        # try to set the pointer to start to reread the special section
        fh.seek(start)

        # then reread the special section

However, this approach gives the following error:

telling position disabled by next() call

Is there a way to avoid this?

2 Answers

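I'm no expert in Python 3, but it looks like you are using a generator that yields the lines read from the file. That means you can only move in one direction.

You will need a different approach to solve this.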

Answered 2025-04-18 by Python大师

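When you use a file as an iterator, for example by calling next() on it or by using it in a for loop, an internal buffer is used. The actual read position is therefore further along in the file, and .tell() cannot give you the position of the next line to be yielded (in Python 3 text mode it raises the error quoted in the question instead).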
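The effect can be reproduced in a few lines. The sketch below assumes CPython 3 and a text-mode file; the temporary file and its contents are invented purely for the demonstration.

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first\nsecond\nthird\n')
    path = tmp.name

message = ''
with open(path) as fh:
    next(fh)                  # iterate over the file object directly
    try:
        fh.tell()
        tell_failed = False
    except OSError as exc:    # text-mode files refuse to report a position here
        tell_failed = True
        message = str(exc)

os.remove(path)
print(tell_failed, message)
```

Once the file has been advanced with next(), the position can no longer be queried, which is why the fix has to change how lines are read rather than where tell() is called.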

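If you need to seek back and forth in the file, the solution is not to call next() directly on the file object but to use only file.readline(). You can still use an iterator for that, via the two-argument version of iter():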

fileobj = open(filename)
fh = iter(fileobj.readline, '')

Each call to next() on fh calls fileobj.readline(), until that function returns an empty string. In effect, this creates a file iterator that does not use the internal buffer.

Example:

>>> fh = open('example.txt')
>>> fhiter = iter(fh.readline, '')
>>> next(fhiter)
'foo spam eggs\n'
>>> fh.tell()
14
>>> fh.seek(0)
0
>>> next(fhiter)
'foo spam eggs\n'

Note that your enumerate chain can be simplified to:

items = itertools.chain(enumerate(fh, start=1), (None,))

That said, I am not clear why you think a (None,) sentinel is needed here; StopIteration will still be raised, just one next() call later.
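A small sketch of that behaviour, using a plain list as a stand-in for the open file (the list contents are made up for illustration):

```python
import itertools

lines = ['a\n', 'b\n']    # stand-in for an open file

items = itertools.chain(enumerate(lines, start=1), (None,))

# Two real (lino, line) pairs come out first, then the None sentinel.
seen = [next(items), next(items), next(items)]

try:
    next(items)
    exhausted = False
except StopIteration:     # raised exactly one call later than without the sentinel
    exhausted = True

print(seen, exhausted)
```

The sentinel only postpones the exception, and code that unpacks `lino, line = next(items)` would fail on the None value anyway.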

To read specialLines lines, you can use itertools.islice():

from itertools import islice

for lino, eline in islice(items, specialLines):
    # etc. get the special data I need here
    pass
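As a rough illustration of how islice() takes a fixed number of items from a shared iterator and leaves the rest in place, here is a sketch with invented data; `special_lines` stands in for the asker's specialLines value.

```python
from itertools import islice

special_lines = 2    # assumed size of the special section

# Stand-in for enumerate(fh, start=1) over a real file.
items = enumerate(['SPECIAL\n', 'x\n', 'y\n', 'after\n'], start=1)

header = next(items)                          # the SPECIAL marker line
special = list(islice(items, special_lines))  # exactly two more items
rest = list(items)                            # whatever islice did not consume

print(header, special, rest)
```

Because islice() draws from the same underlying iterator, the outer loop resumes right after the special section without any manual counting.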

You can also loop over the lines with a for loop instead of using an infinite loop and next() calls:

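from itertools import islice

with open(filename) as fh:
    # iter(fh.readline, '') avoids the read-ahead buffer, so tell() and seek() stay usable
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        # handle special section
        if line.startswith('SPECIAL'):
            start = fh.tell()

            for lino, eline in islice(enumerated, specialLines):
                # etc. get the special data I need here
                pass

            # rewind so the special section is read again by the outer loop
            fh.seek(start)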

Note, however, that the line numbers keep increasing even after you seek back!
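The renumbering can be seen in a small self-contained run. This is only a sketch: the temporary file, its contents, and `special_lines` are assumptions made up for the demonstration.

```python
import os
import tempfile
from itertools import islice

special_lines = 2    # assumed length of the special section

# Hypothetical input file; newline='' keeps offsets identical on every platform.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, newline='') as tmp:
    tmp.write('intro\nSPECIAL\nx\ny\nend\n')
    path = tmp.name

seen = []
with open(path, newline='') as fh:
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        if line.startswith('SPECIAL'):
            start = fh.tell()                 # position just after the marker line
            for lino, eline in islice(enumerated, special_lines):
                seen.append(('first pass', lino, eline.strip()))
            fh.seek(start)                    # rewind; enumerate() does not rewind
        else:
            seen.append(('main loop', lino, line.strip()))

os.remove(path)
print(seen)
```

The lines x and y are reported as 3 and 4 on the first pass and as 5 and 6 after the rewind, because enumerate() only counts how many items it has yielded and knows nothing about the file position.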

You may want to refactor your code so that it does not need to reread parts of the file at all, though.
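One possible shape for such a refactoring is to read the special section into a list once and then make both passes over that list. The sketch below is only an illustration: the helper `read_special`, the `mode=csv` line, and the data format are all hypothetical and not taken from the question.

```python
from itertools import islice


def read_special(lines, count):
    """Collect `count` lines once so they can be inspected and then parsed."""
    return [line.rstrip('\n') for line in islice(lines, count)]


# Stand-in for iterating over a real file.
lines = iter(['SPECIAL\n', 'mode=csv\n', '1,2\n', 'after\n'])

next(lines)                       # consume the SPECIAL marker line
block = read_special(lines, 2)

# First pass: find the information that decides how to parse the section.
as_csv = any(entry == 'mode=csv' for entry in block)

# Second pass: parse the same in-memory lines, no seek() required.
parsed = [
    entry.split(',') if as_csv else entry
    for entry in block
    if not entry.startswith('mode=')
]

remaining = next(lines)           # the main loop would continue from here
print(block, parsed, remaining)
```

With this layout the file is only ever read forwards, so plain iteration over the file object works and line numbers stay accurate.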

Answered 2025-04-18 by Python大师
