In Python, how can I efficiently check whether a string was found in a file?

1 vote

3 answers

2977 views

数据工程师

Asked on 2025-04-18 11:52

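In a Python function I am writing, I read a text file line by line in order to replace a particular string with a numeric value. When I reach the last line of the file, I want to know whether the string occurred at all.

str.replace() does not tell me whether it replaced anything, so I end up going over each line twice: once to look for the string and once to replace it.

So far I have come up with two ways to solve this.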

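Approach 1, for each line:
- search for the string with line.find(...), but only if it has not been found yet
- if it is found, mark it as found
- then replace with newLine = line.replace(...)
- (do something with newLine ...)

Approach 2, for each line:
- first replace with newLine = line.replace(...)
- if newLine != line, mark the string as found
- (do something with newLine ...)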
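
The two approaches above can be sketched roughly as follows. This is only an illustration: the function names, the sample lines, and the choice to collect results in a list are assumptions, not part of the original question.

```python
def replace_with_find(lines, old, new):
    """Approach 1: search with find() until the first hit, then replace."""
    found = False
    out = []
    for line in lines:
        # Skip the search once the string has been seen somewhere.
        if not found and line.find(old) != -1:
            found = True
        out.append(line.replace(old, new))
    return out, found


def replace_then_compare(lines, old, new):
    """Approach 2: replace first, then compare the result with the original."""
    found = False
    out = []
    for line in lines:
        new_line = line.replace(old, new)
        if new_line != line:
            found = True
        out.append(new_line)
    return out, found


# Hypothetical sample data standing in for the lines of a file.
sample = ["alpha XXX\n", "beta\n", "XXX gamma XXX\n"]
result_1 = replace_with_find(sample, "XXX", "42")
result_2 = replace_then_compare(sample, "XXX", "42")
```

One difference worth noting: if the old and new strings happen to be identical, approach 2 reports the string as not found even when it is present, because the replaced line equals the original.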

My question is: is there a better way, meaning one that is more efficient or more Pythonic? If not, which of the two approaches above is faster?


3 Answers

Since we have to traverse the string twice anyway, I would do it like this:

with open('yourfile.txt', 'r', encoding='utf-8') as f:  # adjust the encoding to match your file
    s = f.read()
oldstr, newstr = 'XXX', 'YYY'
# str.count() matches literally; re.finditer() would treat oldstr as a regex pattern
count = s.count(oldstr)
s_new = s.replace(oldstr, newstr)
print(oldstr, 'has been found and replaced by', newstr, count, 'times')
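
If a single pass is preferred, re.subn() from the standard library returns both the new string and the number of substitutions made. The sketch below is a variation on the idea above, not the answerer's code; the helper name and the sample text are made up for illustration.

```python
import re


def replace_and_count(text, old, new):
    """Replace every literal occurrence of `old` and report how many were replaced."""
    # re.escape() makes `old` match literally; the lambda keeps `new` from being
    # interpreted as a regex replacement template (backslashes, group references).
    return re.subn(re.escape(old), lambda match: new, text)


# Hypothetical sample text: "a.b" must match literally, so "axb" is left alone.
new_text, count = replace_and_count("a.b a.b axb", "a.b", "7")
found = count > 0
```

Whether this is actually faster than str.count() followed by str.replace() would have to be measured on the real data.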

Answered on 2025-04-18 by Python大师


This example can handle multiple replacements:

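# Each key maps to [replacement value, number of occurrences replaced]
replacements = {'string': [1, 0], 'string2': [2, 0]}

with open('somefile.txt') as f:
    for line in f:
        new_line = line
        # Longest keys first, so 'string2' is handled before its prefix 'string'
        for key in sorted(replacements, key=len, reverse=True):
            value = replacements[key]
            hits = new_line.count(key)
            if hits:
                # str.replace() needs a string, so convert the numeric value
                new_line = new_line.replace(key, str(value[0]))
                value[1] += hits
        # (do something with new_line ...)

# At the end

for key, value in replacements.items():  # .items() replaces Python 2's .iteritems()
    print('Replaced {} with {} {} times'.format(key, *value))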
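
When several search strings are involved, one alternative is to build a single regular expression and count hits in a callback, so each line is scanned only once. This is a sketch under assumptions: the function name and sample input are hypothetical, and the mapping is assumed to be non-empty.

```python
import re
from collections import Counter


def replace_many(text, mapping):
    """Replace every key of `mapping` in one pass and count hits per key."""
    counts = Counter()
    # Longest keys first, so 'string2' wins over its prefix 'string'.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))

    def substitute(match):
        key = match.group(0)
        counts[key] += 1
        return str(mapping[key])  # replacement values may be numbers

    return pattern.sub(substitute, text), counts


# Hypothetical sample line using the same keys as the example above.
new_text, counts = replace_many("string string2 string\n", {"string": 1, "string2": 2})
```

A key with a count of zero in `counts` was never found, which answers the original question for each search string separately.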

Answered on 2025-04-18 by Python大师

I would do roughly this:

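oldstring, newstring = 'XXX', 'YYY'  # placeholder values
found = False
newlines = []

with open('yourfile.txt') as f:  # placeholder file name
    for line in f:
        if oldstring in line:
            found = True
            newlines.append(line.replace(oldstring, newstring))
        else:
            newlines.append(line)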

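because that is the easiest for me to understand.

There may be faster ways, but the best approach depends on how often the string occurs in the lines. It makes a big difference whether almost every line contains the string or almost none do.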
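
To see how the hit rate affects the result, the variants can be timed with timeit on synthetic data. The sketch below is only one way to set this up; the function names, line contents, line count, and repetition count are all arbitrary assumptions, and the numbers will vary by machine and Python version.

```python
import timeit


def with_membership_test(lines, old, new):
    """Check `old in line` first and call replace() only on matching lines."""
    found = False
    out = []
    for line in lines:
        if old in line:
            found = True
            out.append(line.replace(old, new))
        else:
            out.append(line)
    return out, found


def unconditional_replace(lines, old, new):
    """Call replace() on every line, then compare the result with the input."""
    out = [line.replace(old, new) for line in lines]
    return out, out != lines


def make_lines(n, hit_every):
    """Build n lines; every `hit_every`-th line contains the search string."""
    return ["data XXX data\n" if i % hit_every == 0 else "data data data\n"
            for i in range(n)]


timings = {}
for label, hit_every in [("dense", 1), ("sparse", 1000)]:
    lines = make_lines(10_000, hit_every)
    for func in (with_membership_test, unconditional_replace):
        seconds = timeit.timeit(lambda: func(lines, "XXX", "42"), number=20)
        timings[(label, func.__name__)] = seconds

for key, seconds in timings.items():
    print(key, round(seconds, 4))
```

Running this against a sample of the real file, rather than synthetic lines, would give a more trustworthy answer.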

Answered on 2025-04-18 by Python大师
