Trying to mimic grep in Python, but how do I make it support multiple flags?

Posted 2024-04-20 07:37:21


I'm trying to mimic the functionality of the Linux grep command. This is what I have so far:

import re
import os

x = input("grep flag pattern file").replace('"', '') .split()

if ("-n" in x):
    with open(x[len(x)-1]) as myFile:
        for num, line in enumerate(myFile, 1):
            if (x[len(x)-2] in line):
                print ('found at line:', num)

if ("-l" in x):
    for file in os.listdir():
        with open(file) as myFile:
            for line in myFile:
                 if (re.search(x[2], line)):
                    print(file)

if ("-i" in x):
    with open(x[len(x)-1]) as myFile:
         for line in myFile:
            if (re.search(x[len(x)-2],line,re.IGNORECASE)):
                print(line.rstrip("\n"))

if ("-v" in x):
    with open(x[len(x)-1]) as myFile:
         for line in myFile:
            if (x[len(x)-2] not in line):
                print(line.rstrip("\n"))

if ("-x" in x):
   with open(x[len(x)-1]) as myFile:
        for line in myFile:
            if (re.match(x[len(x)-2].replace("_"," "), line)):
                print(line.rstrip("\n"))

if ("-n" not in x and "-l" not in x and "-i" not in x and "-v" not in x and "-x" not in x):
    with open(x[2]) as myFile:
        for line in myFile:
            if (re.search(x[1], line)):
                print(line.rstrip("\n"))

This works if I use only one flag (e.g. "-n"), but if I pass multiple flags (e.g. "-n" "-i"), each flag is handled separately instead of being combined.

Basically, what I want is to enter grep -i -v "kaneki" unravel.txt

and have it output

Oshiete oshiete yo sono shikumi wo
Boku no naka ni dare ga iru no?
Kowareta kowareta yo kono sekai de
Kimi ga warau nanimo miezu ni

#TokyoGhoul

when my original text file is:

Oshiete oshiete yo sono shikumi wo
Boku no naka ni dare ga iru no?
Kowareta kowareta yo kono sekai de
Kimi ga warau nanimo miezu ni

I LOVE KEN KANEKI <3

#TokyoGhoul

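Is there a built-in function that does this? Or do you know how I could do it?

These are the flags:

n = prints the line number of each matching line
l = prints the names of the text files that contain the pattern
i = case-insensitive comparison
v = prints the lines that don't contain the pattern
x = prints only the lines that match the pattern in their entirety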

2 Answers

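From the comments, it sounds like you could use an example of how to apply argparse to your code. This implements the -i and -n options and lets them be specified independently. It should be enough to get you started: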

import argparse
import re

parser = argparse.ArgumentParser()

# basic minimum
parser.add_argument("-n", action="store_true")

# give this one a long name and a help string
parser.add_argument("-i", "--ignore-case",
                    action="store_true", help="case insensitive")

parser.add_argument("pattern")
parser.add_argument("filename")

x = input("grep flag pattern file ").replace('"', '').split()
args = parser.parse_args(x)

if args.ignore_case:
    flags = re.IGNORECASE
else:
    flags = 0

with open(args.filename) as myFile:
    for num, line in enumerate(myFile, 1):  # number lines from 1, as grep does
        if re.search(args.pattern, line, flags):
            if args.n:
                print("found at line ", num)
            else:
                print(line.rstrip("\n"))
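
One caveat with the prompt-based input above: `.replace('"', '').split()` breaks a quoted pattern that contains spaces into several arguments. A minimal sketch, assuming the standard-library `shlex.split` is an acceptable substitute, shows the same parser handling a quoted pattern and two independent flags. The helper name `build_parser` is hypothetical and only used here for illustration.

```python
import argparse
import shlex


def build_parser():
    # Same options as the answer above, wrapped in a function for reuse.
    parser = argparse.ArgumentParser()
    parser.add_argument("-n", action="store_true")
    parser.add_argument("-i", "--ignore-case", action="store_true",
                        help="case insensitive")
    parser.add_argument("pattern")
    parser.add_argument("filename")
    return parser


# shlex.split honours shell-style quoting, so the pattern stays in one piece.
tokens = shlex.split('-i -n "ken kaneki" unravel.txt')
args = build_parser().parse_args(tokens)
print(args.n, args.ignore_case, args.pattern, args.filename)
```

argparse also accepts short boolean flags joined behind a single dash, so `-in` is parsed the same way as `-i -n`.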

The usual way to use parse_args is to read the options from the command line in a script. If you replace

x = input("grep flag pattern file ").replace('"', '').split()
args = parser.parse_args(x)

with

args = parser.parse_args()

then you can run the script with the following command instead of being prompted for input:

python myscript.py -i mypattern myfile

You can also run:

python myscript.py --help

to get a help message such as:

usage: myscript.py [-h] [-n] [-i] pattern filename

positional arguments:
  pattern
  filename

optional arguments:
  -h, --help         show this help message and exit
  -n
  -i, --ignore-case  case insensitive    <=== help string you put in your code

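Note that it is usually tidiest to put all the argument-parsing code into one function that sets up the parser and returns the parsed arguments. Going back to your initial example, with the argument list x, it might look like this: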

def parse_my_args(x):
    parser = ......
    parser.add_argument(.....)
    ... etc ...
    return parser.parse_args(x)

x = .......
args = parse_my_args(x)
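
Filling in the outline above, a hedged sketch of such a function covering all five flags from the question might look like the following. The long option names (`--line-number`, `--files-with-matches`, `--invert-match`, `--line-regexp`) are borrowed from GNU grep as an assumption; nothing in the original code defines them. Note that `parse_args` returns an `argparse.Namespace`, so the values are read as attributes rather than dictionary keys.

```python
import argparse


def parse_my_args(argv):
    """Set up the parser and return the parsed arguments as a Namespace."""
    parser = argparse.ArgumentParser(description="minimal grep-like tool")
    parser.add_argument("-n", "--line-number", action="store_true",
                        help="prefix each match with its line number")
    parser.add_argument("-l", "--files-with-matches", action="store_true",
                        help="print only the names of matching files")
    parser.add_argument("-i", "--ignore-case", action="store_true",
                        help="case-insensitive comparison")
    parser.add_argument("-v", "--invert-match", action="store_true",
                        help="print lines that do not match")
    parser.add_argument("-x", "--line-regexp", action="store_true",
                        help="match only whole lines")
    parser.add_argument("pattern")
    parser.add_argument("filename")
    return parser.parse_args(argv)


# Each flag ends up as an independent boolean, so they can be combined freely.
args = parse_my_args(["-i", "-v", "kaneki", "unravel.txt"])
print(args.ignore_case, args.invert_match, args.line_number)
```

With every flag stored as its own boolean, the file only needs to be read once, and each line can be tested against all the flags together.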

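You are doing a complete read-and-print pass for every flag. I think you need to rethink the strategy. I wrote a simplified version here (taking some liberties, since I don't have your files).

Basically, I split it into distinct logical parts: a phase that sets everything up, followed by a phase that tests every line. I hope it gives you some ideas to think about.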

import re
import os

file_lines = [
  'Oshiete oshiete yo sono shikumi wo',
  'Boku no naka ni dare ga iru no?',
  'Kowareta kowareta yo kono sekai de',
  'Kimi ga warau nanimo miezu ni',
  'I LOVE KEN KANEKI <3',
]


x = input("grep flag pattern file: ").replace('"', '') .split()

flags = [word for word in x if word[0] == '-']
subject = x[len(x)-2]
filename = x[len(x)-1]
show_line_numbers = False
line_by_line_checks = []

# check for case insensitive
# force everything to be lower case
if '-i' in flags:
    subject = subject.lower()
    file_lines = [l.lower() for l in file_lines]

if '-n' in flags:
  show_line_numbers = True

# line by line checks
# create more line by line checks if you want
if '-v' in flags:
  line_by_line_checks.append(lambda a : subject not in a)
else:
  line_by_line_checks.append(lambda a : subject in a)

# loop through the lines and see what passes
lines_to_return = {}
for i in range(0, len(file_lines)):
  line = file_lines[i]
  line_passes = False
  for func in line_by_line_checks:
    if func(line):
      line_passes = True
  if line_passes:
    lines_to_return[i] = line

# now you have a dictionary
# the key is the line number
# if the dictionary is empty the file did not pass
# {
#   2: 'Kowareta kowareta yo kono sekai de',
#   4: 'I LOVE KEN KANEKI <3',
# }

# and print out some kind of output
if not bool(lines_to_return):
  print('"{}" did not contain the search pattern'.format(filename))
else:
  for key in lines_to_return:
    if show_line_numbers:
      print("{} {}".format(key, lines_to_return[key]))
    else:
      print(lines_to_return[key])

Some tests:

grep flag pattern file: -v kaneki filename.txt

Oshiete oshiete yo sono shikumi wo
Boku no naka ni dare ga iru no?
Kowareta kowareta yo kono sekai de
Kimi ga warau nanimo miezu ni
I LOVE KEN KANEKI <3

grep flag pattern file: -i -v kaneki filename.txt

oshiete oshiete yo sono shikumi wo
boku no naka ni dare ga iru no?
kowareta kowareta yo kono sekai de
kimi ga warau nanimo miezu ni

grep flag pattern file: -n -i -v "sekai" filename.txt

0 oshiete oshiete yo sono shikumi wo
1 boku no naka ni dare ga iru no?
3 kimi ga warau nanimo miezu ni
4 i love ken kaneki <3

grep flag pattern file: -n -i "sekai" filename.txt

2 kowareta kowareta yo kono sekai de
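
The lower-cased output in the tests above comes from the -i branch rewriting file_lines before they are checked. A sketch of an alternative, assuming regular-expression matching with `re.IGNORECASE` is acceptable in place of the substring test, keeps the original text and numbers lines from 1 as grep does. The function name `grep_lines` and its keyword arguments are hypothetical.

```python
import re


def grep_lines(lines, pattern, ignore_case=False, invert=False, whole_line=False):
    """Return (line_number, line) pairs for the lines that pass every flag."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    # fullmatch implements -x (whole line); search finds the pattern anywhere.
    matcher = regex.fullmatch if whole_line else regex.search
    results = []
    for number, line in enumerate(lines, 1):
        matched = matcher(line) is not None
        # -v keeps exactly the lines that did not match.
        if matched != invert:
            results.append((number, line))
    return results


file_lines = [
    "Oshiete oshiete yo sono shikumi wo",
    "Boku no naka ni dare ga iru no?",
    "Kowareta kowareta yo kono sekai de",
    "Kimi ga warau nanimo miezu ni",
    "I LOVE KEN KANEKI <3",
]

# Equivalent of: -n -i -v kaneki
for number, line in grep_lines(file_lines, "kaneki", ignore_case=True, invert=True):
    print(number, line)
```

Because the pattern is compiled as a regular expression here, a literal search string containing metacharacters would need `re.escape` first.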
