How to check whether the contents of file A appear in the files of a directory

0 votes
3 answers
652 views
Asked 2025-04-18 18:52

I have a file containing several lines of text, for example:

cat
dog
rabbit

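I want to walk through a folder and check whether the text files in it contain any of the entries listed above.

I have tried many approaches in many different ways. I would rather not post any of them, because I want to start from scratch with a fresh approach. The code below has reached the point where I no longer understand what it is doing, and I am completely lost. :(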

#! /usr/bin/python

'''
The purpose of this program
is to search the OS file system
in order to find a txt file that contain the nagios host entries
'''

import os

host_list = open('/path/path/list', 'r')

host = host_list.read()
##for host in host_remove.read():

host_list.close()
#print host

for root, dirs, files in os.walk("/path/path/somefolder/"):
    for file in files:
        if file.endswith(".txt"):

            check_file = os.path.join(root, file)
            #print check_file


            if host.find(check_file): #in check_file:

                print host.find(check_file)                    
                #print host+" is found in "+check_file
                #print os.path.join(root, file)
            else:
                break

3 Answers

0

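I made a few small changes to the algorithm provided by J.F. Sebastian. With these changes the program prompts the user for input. It also runs without problems on Windows.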

#!/usr/bin/env python
import os
import re
import sys

contents = raw_input("Please provide the full path and file name that contains the items you would like to search for \n")
print "\n"
print "\n"
direct = raw_input("Please provide the directory you would like to search. \
Use C:/, if you want to search the root directory on a windows machine\n")

def files_with_matched_lines(topdir, matched):
    for root, dirs, files in os.walk(topdir, topdown=True):
        dirs[:] = [d for d in dirs if not d.startswith('.')] # skip "hidden" dirs
        for filename in files:
            if filename.endswith(".txt"):
                path = os.path.join(root, filename)
                try:
                    with open(path) as file:
                        for line in file:
                            if matched(line):
                                yield path
                                break
                except EnvironmentError as e:
                    print >>sys.stderr, e

with open(contents) as file:
    hosts = file.read().splitlines()
matched = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, hosts))).search
for path in files_with_matched_lines(direct, matched):
    print path

2

Doing this in Python is overkill. Just use grep:

$ grep -wFf list_of_needles.txt some_target.txt

If you really need to use Python, wrap the grep call in subprocess or something similar.
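For reference, the subprocess route might look roughly like the sketch below. It assumes Python 3 and a `grep` executable on the PATH; the helper name `grep_whole_words` is made up for illustration and is not part of any library.

```python
import subprocess


def grep_whole_words(needles_path, target_path):
    """Return the lines of target_path that contain any needle as a whole word.

    Hypothetical helper: it simply shells out to `grep -wFf`.
    """
    result = subprocess.run(
        ["grep", "-wFf", needles_path, target_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # grep exits with 0 on a match, 1 on no match, and >1 on an error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

Calling `grep_whole_words("list_of_needles.txt", "some_target.txt")` should return the same lines the shell command prints. The argument list is passed without `shell=True`, so paths containing spaces need no extra quoting.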

2

The task is analogous to this command line:

$ find /path/somefolder/ -name \*.txt -type f -exec grep -wFf /path/list {} +

In Python:

#!/usr/bin/env python
import os
import re
import sys

def files_with_matched_lines(topdir, matched):
    for root, dirs, files in os.walk(topdir, topdown=True):
        dirs[:] = [d for d in dirs if not d.startswith('.')] # skip "hidden" dirs
        for filename in files:
            if filename.endswith(".txt"):
                path = os.path.join(root, filename)
                try:
                    with open(path) as file:
                        for line in file:
                            if matched(line):
                                yield path
                                break
                except EnvironmentError as e:
                    print >>sys.stderr, e

with open('/path/list') as file:
    hosts = file.read().splitlines()
matched = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, hosts))).search
for path in files_with_matched_lines("/path/somefolder/", matched):
    print path
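
The code above is written for Python 2 (`print` statements, `print >>sys.stderr`). A rough Python 3 sketch of the same idea follows. The `build_matcher` helper, the skipping of blank lines in the list, and `errors="replace"` when opening files are my own assumptions and not part of the original answer.

```python
#!/usr/bin/env python3
import os
import re
import sys


def build_matcher(hosts):
    """Return a search function matching any host as a whole word.

    Hypothetical helper: blank entries are dropped, because an empty
    alternative in the regex would match at every word boundary.
    """
    hosts = [h for h in hosts if h.strip()]
    if not hosts:
        return lambda line: None  # nothing to search for, so never match
    pattern = r"\b(?:%s)\b" % "|".join(map(re.escape, hosts))
    return re.compile(pattern).search


def files_with_matched_lines(topdir, matched):
    """Yield the path of each .txt file under topdir that has a matching line."""
    for root, dirs, files in os.walk(topdir, topdown=True):
        dirs[:] = [d for d in dirs if not d.startswith('.')]  # skip "hidden" dirs
        for filename in files:
            if filename.endswith(".txt"):
                path = os.path.join(root, filename)
                try:
                    # errors="replace" is an assumption: it keeps undecodable
                    # bytes from aborting the scan of a file.
                    with open(path, errors="replace") as file:
                        if any(matched(line) for line in file):
                            yield path
                except OSError as e:
                    print(e, file=sys.stderr)
```

To use it, read `/path/list` with `file.read().splitlines()`, pass the result to `build_matcher`, and print each path yielded by `files_with_matched_lines("/path/somefolder/", matched)`.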
