How do I find the position of a word in a file?

Published 2024-04-26 22:57:40



Tags: python
3 answers

Try this:

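import os

# file_dmp_path (str) and SEARCH_WORD (non-empty bytes) must be defined beforehand.
positions = []  # byte offset of every non-overlapping match
with open(file_dmp_path, 'rb') as file:
    fsize = bsize = os.path.getsize(file_dmp_path)
    word_len = len(SEARCH_WORD)
    while True:
        start = file.tell()  # offset at which this block begins
        p = file.read(bsize).find(SEARCH_WORD)
        if p > -1:
            pos_dec = start + p  # absolute offset of the match
            positions.append(pos_dec)
            file.seek(pos_dec + word_len)  # resume right after the match
            bsize = fsize - file.tell()
        elif file.tell() < fsize:
            # No match in this block: back up so a word that straddles
            # two blocks is not missed.
            file.seek(file.tell() - word_len + 1)
        else:
            break
print(positions)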

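You cannot find the position of text in a file without opening the file. That would be like asking someone to read a newspaper without opening their eyes.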

As for the first part of your question, that is relatively simple.

with open('Path/to/file', 'r') as f:
    content = f.read()
    # index() returns the offset of the first match and raises ValueError if there is none
    print(content.index('test'))
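
If every occurrence is needed rather than just the first, one option is to call `str.find` in a loop. This is only a minimal sketch, assuming the file is small enough to read into memory; the helper name `find_all` is made up for illustration and is not part of the answer above.

```python
def find_all(text, word):
    """Return the start offsets of every non-overlapping occurrence of word in text."""
    if not word:
        return []  # an empty search word would otherwise match everywhere
    positions = []
    start = text.find(word)
    while start != -1:
        positions.append(start)
        # Continue searching just past the match that was found.
        start = text.find(word, start + len(word))
    return positions


print(find_all("a test, another test", "test"))
```

With the file from the answer above, it could be called as `find_all(content, 'test')`. Unlike `index`, it returns an empty list instead of raising when the word is absent.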

You can use memory-mapped files and regular expressions.

Memory-mapped file objects behave like both strings and like file objects. Unlike normal string objects, however, these are mutable. You can use mmap objects in most places where strings are expected; for example, you can use the re module to search through a memory-mapped file. Since they’re mutable, you can change a single character by doing obj[index] = 'a', or change a substring by assigning to a slice: obj[i1:i2] = '...'. You can also read and write data starting at the current file position, and seek() through the file to different positions.

Example

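import re
import mmap

with open('path/filename', 'r+b') as f:
    mf = mmap.mmap(f.fileno(), 0)  # length 0 maps the whole file
    mf.seek(0)  # reset file cursor
    m = re.search(b'pattern', mf)  # in Python 3 the pattern must be bytes
    if m:  # re.search returns None when nothing matches
        print(m.start(), m.end())
    mf.close()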
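
To list every match instead of only the first, `re.finditer` can be run over the mapped file in the same way. The following is a sketch under a few assumptions: Python 3, a literal word rather than a regular expression (hence `re.escape`), and a read-only mapping. The function name `find_all_in_file` and the throwaway demo file are hypothetical, added only so the snippet runs on its own.

```python
import mmap
import os
import re
import tempfile


def find_all_in_file(path, word):
    """Return (start, end) byte offsets of every match of the literal bytes `word`."""
    with open(path, 'rb') as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mf:
            pattern = re.escape(word)  # treat the word literally, not as a regex
            return [(m.start(), m.end()) for m in re.finditer(pattern, mf)]


# Demo on a temporary file (hypothetical content, for illustration only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"one test, two test")
    demo_path = tmp.name

print(find_all_in_file(demo_path, b"test"))
os.remove(demo_path)
```

The offsets are byte offsets, not character offsets, so they can differ from `str.index` results for text with multi-byte characters. An empty file cannot be mapped, so it is worth checking the file size first.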
