grep -r in Python

Posted 2024-04-20 05:50:45


I want to implement the Unix command "grep -r" as a Python function. I know about commands.getstatusoutput(), but I don't want to use it for now. I came up with this:

def grep_r (str, dir):
    files = [ o[0]+"/"+f for o in os.walk(dir) for f in o[2] if os.path.isfile(o[0]+"/"+f) ]
    return [ l for f in files for l in open(f) if str in l ]

But of course this doesn't use regular expressions; it only checks whether "str" is a substring of "l". So I tried the following:

def grep_r (pattern, dir):
    r = re.compile(pattern)
    files = [ o[0]+"/"+f for o in os.walk(dir) for f in o[2] if os.path.isfile(o[0]+"/"+f) ]
    return [ l for f in files for l in open(f) if r.match(l) ]

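But this doesn't work: it gives me no matches at all, even in cases where the first function finds some. What changed? I could split it into a bunch of nested loops, but I'm more interested in brevity than readability.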


3 answers

You probably want search() instead of match(), so that matches in the middle of a line are caught as well, as described at http://docs.python.org/library/re.html#matching-vs-searching

Also, the structure and intent of your code are rather obscured. I've made it more Pythonic.

import os
import re

def grep_r (pattern, dir):
    r = re.compile(pattern)
    for parent, dnames, fnames in os.walk(dir):
        for fname in fnames:
            filename = os.path.join(parent, fname)
            if os.path.isfile(filename):
                with open(filename) as f:
                    for line in f:
                        if r.search(line):
                            yield line
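Because this version is a generator, lines are read only as the caller asks for them. The following is a minimal usage sketch, assuming Python 3; the temporary directory, the file names a.txt and sub/b.txt, and the parameter name `top` are made up for illustration, and the function is repeated only so the example runs on its own.

```python
import os
import re
import tempfile


def grep_r(pattern, top):
    """Yield every line under `top` that contains a match for `pattern`."""
    regex = re.compile(pattern)
    for parent, _dirnames, fnames in os.walk(top):
        for fname in fnames:
            with open(os.path.join(parent, fname)) as f:
                for line in f:
                    if regex.search(line):
                        yield line


# Build a tiny throwaway tree (hypothetical file names) to search.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    with open(os.path.join(top, "a.txt"), "w") as f:
        f.write("first line\nsay hello\n")
    with open(os.path.join(top, "sub", "b.txt"), "w") as f:
        f.write("hello again\nnothing here\n")

    gen = grep_r(r"hel+o", top)
    first = next(gen)  # reads only as far as the first matching line
    gen.close()        # releases the file the paused generator still holds

    # sorted() drains a fresh generator while the files still exist.
    everything = sorted(grep_r(r"hel+o", top))

print(first)
print(everything)
```

If a plain list is all you need, wrapping the call in list() gives the same result as the list comprehension in the question.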

re.match only checks for a match at the beginning of the string.

Use re.search() instead.

From the docs:

Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default).
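A small sketch of the difference, using a sample line chosen purely for illustration. It shows why the match() version in the question finds nothing unless the pattern happens to start at the first character of a line.

```python
import re

line = "def grep_r(pattern, dir):"
pattern = re.compile(r"grep")

# match() only succeeds if the pattern matches starting at index 0.
at_start = pattern.match(line)

# search() scans the whole string and returns the first match anywhere.
anywhere = pattern.search(line)

# Prefixing ".*" lets match() reach the middle of the line,
# but search() expresses the intent more directly.
padded = re.match(r".*grep", line)

print(at_start)            # None, because the line starts with "def"
print(anywhere.start())    # 4
print(padded is not None)  # True
```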

Put all of this code in a file named pygrep, then run chmod +x pygrep:

#!/usr/bin/python

import os
import re
import sys

def file_match(fname, pat):
    try:
        f = open(fname, "rt")
    except IOError:
        return
    for i, line in enumerate(f):
        if pat.search(line):
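            # Parenthesized form runs on Python 2 and 3; strip the newline so output is not double-spaced.
            print("%s: %i: %s" % (fname, i+1, line.rstrip("\n")))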
    f.close()


def grep(dir_name, s_pat):
    pat = re.compile(s_pat)
    for dirpath, dirnames, filenames in os.walk(dir_name):
        for fname in filenames:
            fullname = os.path.join(dirpath, fname)
            file_match(fullname, pat)

if len(sys.argv) != 3:
    u = "Usage: pygrep <dir_name> <pattern>\n"
    sys.stderr.write(u)
    sys.exit(1)

grep(sys.argv[1], sys.argv[2])
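The script above prints its results directly. As a rough sketch of the same idea for Python 3, the variant below returns the matches as (path, line number, text) tuples so a caller can process them. The tuple layout, the `errors="replace"` choice for undecodable bytes, and the file name x.txt in the demo are assumptions made for illustration, not part of the original answer.

```python
import os
import re
import tempfile


def grep(dir_name, s_pat):
    """Return (path, line_number, text) for every matching line under dir_name."""
    pat = re.compile(s_pat)
    results = []
    for dirpath, _dirnames, filenames in os.walk(dir_name):
        for fname in filenames:
            fullname = os.path.join(dirpath, fname)
            try:
                # Assumption: replace undecodable bytes instead of failing on binary files.
                with open(fullname, "rt", errors="replace") as f:
                    for i, line in enumerate(f, start=1):
                        if pat.search(line):
                            results.append((fullname, i, line.rstrip("\n")))
            except OSError:
                continue  # unreadable file: skip it, as the script above does
    return results


# Demo on a throwaway directory with one hypothetical file.
with tempfile.TemporaryDirectory() as top:
    with open(os.path.join(top, "x.txt"), "w") as f:
        f.write("alpha\nbeta\ngamma\n")
    found = [(os.path.basename(p), n, t) for p, n, t in grep(top, r"^be")]

print(found)  # [('x.txt', 2, 'beta')]
```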
