How to check whether the contents of file A appear in the files of a directory
I have a file containing several lines of text, for example:
cat
dog
rabbit
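I want to walk through a folder and check whether the text files in it contain any of the entries listed above.
I have tried many approaches in many different ways. I did not want to post anything because I wanted to start over from scratch with a fresh approach. The code below has reached the point where I no longer understand what is happening, and I am completely lost. :(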
#! /usr/bin/python
'''
The purpose of this program
is to search the OS file system
in order to find a txt file that contains the nagios host entries
'''
import os
host_list = open('/path/path/list', 'r')
host = host_list.read()
##for host in host_remove.read():
host_list.close()
#print host
for root, dirs, files in os.walk("/path/path/somefolder/"):
    for file in files:
        if file.endswith(".txt"):
            check_file = os.path.join(root, file)
            #print check_file
            if host.find(check_file): #in check_file:
                print host.find(check_file)
                #print host+" is found in "+check_file
                #print os.path.join(root, file)
            else:
                break
3 Answers
0
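I made some small changes to the algorithm provided by J.F. Sebastian. With these changes the program prompts the user for input. It also runs without problems on Windows.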
#!/usr/bin/env python
import os
import re
import sys
contents = raw_input("Please provide the full path and file name that contains the items you would like to search for \n")
print "\n"
print "\n"
direct = raw_input("Please provide the directory you would like to search. \
Use C:/, if you want to search the root directory on a windows machine\n")
def files_with_matched_lines(topdir, matched):
    for root, dirs, files in os.walk(topdir, topdown=True):
        dirs[:] = [d for d in dirs if not d.startswith('.')] # skip "hidden" dirs
        for filename in files:
            if filename.endswith(".txt"):
                path = os.path.join(root, filename)
                try:
                    with open(path) as file:
                        for line in file:
                            if matched(line):
                                yield path
                                break
                except EnvironmentError as e:
                    print >>sys.stderr, e

with open(contents) as file:
    hosts = file.read().splitlines()
matched = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, hosts))).search
for path in files_with_matched_lines(direct, matched):
    print path
2
Doing this in Python is overkill. Just use grep:
$ grep -wFf list_of_needles.txt some_target.txt
If you really need Python, wrap the grep call in subprocess or something similar.
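The subprocess route mentioned above could look roughly like the sketch below. It is written for Python 3 and assumes a grep on PATH that supports the -l, -w, -F and -f options. The function name files_containing_needles is made up for illustration and is not part of any library.

```python
import subprocess


def files_containing_needles(needles_path, target_paths):
    """Return the target files in which grep finds any whole-word needle.

    Hypothetical helper: the name and signature are illustrative only.
    """
    target_paths = list(target_paths)
    if not target_paths:
        return []  # with no file arguments grep would read stdin instead
    result = subprocess.run(
        # -l: print only names of matching files, -w: whole words,
        # -F: fixed strings, -f: read the patterns from needles_path
        ["grep", "-lwFf", needles_path, "--"] + target_paths,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # grep exits with 0 on a match, 1 on no match, and >1 on errors
    if result.returncode > 1:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.splitlines()
```

The list of targets can come from os.walk or glob. Note that this sketch raises as soon as grep reports any error (an unreadable file, for example), even if other files matched, and that file names containing newlines would be split incorrectly.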
2
Here is a shell command that does this:
$ find /path/somefolder/ -name \*.txt -type f -exec grep -wFf /path/list {} +
In Python:
#!/usr/bin/env python
import os
import re
import sys
def files_with_matched_lines(topdir, matched):
    for root, dirs, files in os.walk(topdir, topdown=True):
        dirs[:] = [d for d in dirs if not d.startswith('.')] # skip "hidden" dirs
        for filename in files:
            if filename.endswith(".txt"):
                path = os.path.join(root, filename)
                try:
                    with open(path) as file:
                        for line in file:
                            if matched(line):
                                yield path
                                break
                except EnvironmentError as e:
                    print >>sys.stderr, e

with open('/path/list') as file:
    hosts = file.read().splitlines()
matched = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, hosts))).search
for path in files_with_matched_lines("/path/somefolder/", matched):
    print path
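
If you also want to know which entries were found in each file, a variation along the following lines might help. This is only a sketch for Python 3, not the code above: the names load_needles and matches_per_file are invented for illustration, blank lines in the list file are skipped, and each file is read fully into memory.

```python
import os
import re
import sys


def load_needles(list_path):
    """Read the list file, dropping blank lines and surrounding whitespace."""
    with open(list_path) as handle:
        return [line.strip() for line in handle if line.strip()]


def matches_per_file(topdir, needles, suffix=".txt"):
    """Yield (path, sorted matched needles) for each file with at least one hit."""
    if not needles:
        return  # an empty alternation would match at every word boundary
    # Longest needles first, so a longer entry wins over one it starts with
    ordered = sorted(needles, key=len, reverse=True)
    pattern = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, ordered)))
    for root, dirs, files in os.walk(topdir):
        for filename in files:
            if not filename.endswith(suffix):
                continue
            path = os.path.join(root, filename)
            try:
                # errors="replace" keeps undecodable bytes from aborting the scan
                with open(path, errors="replace") as handle:
                    found = set(pattern.findall(handle.read()))
            except OSError as error:
                print(error, file=sys.stderr)
                continue
            if found:
                yield path, sorted(found)
```

Usage would be something like iterating over matches_per_file("/path/somefolder/", load_needles("/path/list")) and printing each path with its list of matched entries.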