Using grep in Python

8 votes

4 answers

93412 views

Asked 2025-04-17 11:08

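I have a file called query.txt that contains keywords or phrases, and these need to be matched against other files with the grep command. The last three lines of the code below work fine, but when the same command runs inside the while loop, it goes into an infinite loop or stops responding.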

import os

f=open('query.txt','r')
b=f.readline()
while b:
    cmd='grep %s my2.txt'%b    #my2 is the file in which we are looking for b
    os.system(cmd)
    b=f.readline()
f.close()

a='He is'
cmd='grep %s my2.txt'%a
os.system(cmd)

Tags: command-line tools, infinite loop, file matching, grep

4 Answers

Using Python this way is not really a good idea, but if you truly need to, then do it properly:

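import subprocess

def grep_lines(filename, query_filename):
    """Run grep on filename once for each non-empty line of query_filename."""
    with open(query_filename, "r") as myfile:
        for line in myfile:
            query = line.strip()  # drop the trailing newline and surrounding whitespace
            if query:  # an empty pattern would match every line
                # "-e" marks the next argument as the pattern, even if it starts with "-"
                subprocess.call(["grep", "-e", query, filename])

grep_lines("my2.txt", "query.txt")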

Hopefully your file does not contain any characters that have a special meaning in regular expressions =)
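If the queries are meant as literal text rather than regular expressions, one option is grep's standard -F (fixed strings) option. The following is a minimal sketch, assuming grep is available on PATH; the helper names build_grep_command and grep_fixed are hypothetical and not part of the answer.

```python
import subprocess

def build_grep_command(query, filename):
    """Build an argument list that searches filename for query as literal text."""
    # -F: treat the pattern as a fixed string, not a regular expression.
    # -e: mark the next argument as the pattern, even if it starts with "-".
    # --: end of options, so a file name starting with "-" is not read as an option.
    return ["grep", "-F", "-e", query, "--", filename]

def grep_fixed(query, filename):
    """Run grep and return its exit status (0: match, 1: no match, >1: error)."""
    return subprocess.call(build_grep_command(query, filename))

if __name__ == "__main__":
    # Only show the command here; call grep_fixed() to actually run it.
    print(build_grep_command("a.b (c)", "my2.txt"))
```

Because the arguments are passed as a list, no shell is involved, so characters such as spaces, parentheses, or quotes in the query need no extra quoting.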

Alternatively, you can do this with grep alone:

grep -f query.txt my2.txt

Here is how it works:

~ $ cat my2.txt 
One two
two two
two three
~ $ cat query.txt 
two two
three
~ $ python bar.py 
two two
two three
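If the output is needed inside Python, the same grep -f call can be wrapped so that the matching lines come back as a list. This is only a sketch, assuming Python 3.7 or later (for capture_output) and grep on PATH; grep_patterns_file is a hypothetical helper name.

```python
import subprocess

def grep_patterns_file(patterns_filename, target_filename):
    """Return the lines of target_filename matched by any pattern in patterns_filename."""
    result = subprocess.run(
        ["grep", "-f", patterns_filename, "--", target_filename],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode the output to str
    )
    # grep exits with 0 when lines were selected, 1 when none were, >1 on error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()

if __name__ == "__main__":
    try:
        for line in grep_patterns_file("query.txt", "my2.txt"):
            print(line)
    except (RuntimeError, OSError) as exc:
        print("grep failed:", exc)
```

Treating exit status 1 as "no matches" rather than as a failure keeps an empty result from being mistaken for an error.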

Answered 2025-04-17 by Python大师


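Your code scans the entire my2.txt file once for every query in query.txt.

What you want to do is:

Read all the queries into a list.
Iterate over the lines of my2.txt only once, and check each line against every query.

Try this code: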

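with open('query.txt', 'r') as f:
    # one query per line; skip blank lines, since '' is a substring of every line
    queries = [l.strip() for l in f if l.strip()]

with open('my2.txt', 'r') as f:
    for line in f:
        for query in queries:
            if query in line:
                # line normally still ends with its own newline, so suppress print's
                print(query, line, end='')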
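The loop above prints a line once for every query it contains. If grep-like output is preferred, where each matching line appears only once, a variant along these lines might work. It is a sketch operating on in-memory lists; matching_lines is a hypothetical helper name, and the sample data is illustrative.

```python
def matching_lines(lines, queries):
    """Return each line that contains at least one query, in original order."""
    # Drop empty queries: the empty string is a substring of every line.
    queries = [q for q in queries if q]
    return [line for line in lines if any(q in line for q in queries)]

if __name__ == "__main__":
    lines = ["One two\n", "two two\n", "two three\n"]
    queries = ["two two", "three"]
    for line in matching_lines(lines, queries):
        print(line, end="")
```

Since an open file is itself an iterable of lines, the file object for my2.txt can be passed as lines directly.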

Answered 2025-04-17 by Python大师


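First, you are not iterating over the file in the idiomatic way. You can simply write for b in f: instead of all the .readline() calls.

Second, if a query or file name contains characters that have a special meaning to the shell, your code will break. Use subprocess.call instead of os.system(), and pass it a list of arguments.

Here is a corrected version: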

import subprocess

with open('query.txt', 'r') as f:
    for line in f:
        line = line.rstrip()  # remove trailing whitespace such as '\n'
        if line:  # skip blank lines; an empty pattern matches every line
            subprocess.call(['grep', '-e', line, 'my2.txt'])
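One plausible explanation for the hang in the original loop is that readline() keeps the trailing newline, so the string handed to os.system() contains a line break, which the shell treats as the end of a command. For a single-word query, grep then runs with no file argument and waits for input on stdin, which would look like a hang. The sketch below only builds the strings and does not run them; the query 'three' is an illustrative value.

```python
# readline() returns the line including its trailing "\n".
b = "three\n"
cmd = "grep %s my2.txt" % b

# The shell would see two commands, one per line of the string:
# "grep three" (no file, so grep reads stdin) and " my2.txt".
print(cmd.splitlines())

# Stripping the newline first gives the intended single command.
fixed = "grep %s my2.txt" % b.rstrip()
print(fixed.splitlines())
```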

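However, you can improve your code further by not calling grep at all. Read the contents of my2.txt into a string and search it with the re module. If you do not need regular expressions at all, you can even just test if line in my2_content.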
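A minimal sketch of that pure-Python approach follows. It assumes the file contents are already in a string; grep_text is a hypothetical helper name, and the sample content is illustrative. Note that Python's re syntax is not identical to grep's default basic regular expressions, so patterns may need adjusting.

```python
import re

def grep_text(pattern, text):
    """Return the lines of text that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]

if __name__ == "__main__":
    my2_content = "One two\ntwo two\ntwo three\n"
    # Regular-expression search, line by line.
    print(grep_text(r"t.o$", my2_content))
    # Plain substring test, no regular expression involved.
    print("three" in my2_content)
```

For literal queries that happen to contain regex metacharacters, re.escape(query) can be applied before compiling.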

Answered 2025-04-17 by Python大师
