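Reading from multiple files and storing the data in a list
I want to find all the files in a folder and store the contents of each file in a list for later use.
My problem is that when I debug with print statements to check whether the file exists, it prints the current file, i.e. the first file in the list. But when I try to read that file, I get an error saying the file cannot be found.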
import re
import os

# Program to extract emails from text files
def path_file():
    #path = raw_input("Please enter path to file:\n> ")
    path = '/home/holy/thinker/leads/'
    return os.listdir('/home/holy/thinker/leads') # returns a list like ["file1.txt", 'image.gif'] # need to remove trailing slashes

# read a file as 1 big string
def in_file():
    print path_file()
    content = []
    for a_file in path_file(): # ['add.txt', 'email.txt']
        print a_file
        fin = open(a_file, 'r')
        content.append(fin.read()) # store content of each file
        print content
        fin.close()
    return content

print in_file()
# this is the error i get
""" ['add.txt', 'email.txt']
add.txt
Traceback (most recent call last):
  File "Extractor.py", line 24, in <module>
    print in_file()
  File "Extractor.py", line 17, in in_file
    fin = open(a_file, 'r')
IOError: [Errno 2] No such file or directory: 'add.txt'
"""
The error I get is the one shown above.
3 Answers
0
Here is a rewritten version that uses glob to limit which files are considered:
import glob
import os
import re
import sys

if sys.hexversion < 0x3000000:
    # Python 2.x
    inp = raw_input
else:
    # Python 3.x
    inp = input

def get_dir(prompt):
    while True:
        dir_name = inp(prompt)
        dir_name = os.path.join(os.getcwd(), dir_name)
        if os.path.isdir(dir_name):
            return dir_name
        else:
            print("{} does not exist or is not a directory".format(dir_name))

def files_in_dir(dir_name, file_spec="*.txt"):
    return glob.glob(os.path.join(dir_name, file_spec))

def file_iter(files):
    for fname in files:
        with open(fname) as inf:
            yield fname, inf.read()

def main():
    email_dir = get_dir("Please enter email directory: ")
    email_files = files_in_dir(email_dir, "*.eml")
    print(email_files)
    content = [txt for fname,txt in file_iter(email_files)]
    print(content)

if __name__=="__main__":
    main()
A trial run looks like this:
Please enter email directory: c:\temp
['c:\\temp\\file1.eml', 'c:\\temp\\file2.eml']
['file1 line one\nfile1 line two\nfile1 line three',
'file2 line one\nfile2 line two']
0
You should use the full path of the file you want to read.
So do this:
fin = open(os.path.join(r'/home/holy/thinker/leads/', a_file), 'r')
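As a rough sketch of that fix applied to the whole loop: `read_all` is a hypothetical helper name (not from the question), the code is written for Python 3, and it assumes every file in the folder is text that decodes with the default encoding. Sorting the names and skipping subdirectories are my own additions.

```python
import os


def read_all(directory):
    """Return the contents of every regular file in `directory` as a list of strings."""
    content = []
    # os.listdir returns bare names in arbitrary order; sort for a stable result.
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)  # prepend the directory
        if not os.path.isfile(full_path):          # skip subdirectories and the like
            continue
        with open(full_path, 'r') as fin:          # 'with' closes the file afterwards
            content.append(fin.read())
    return content
```

Calling `read_all('/home/holy/thinker/leads')` should then give one string per file, whatever directory the script is started from.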
1
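os.listdir only returns the names of the files. You need to prepend the directory name to each file name.
It is trying to open add.txt in the same folder you ran the program from. Prepend the folder name to the file name.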
def path_file():
    #path = raw_input("Please enter path to file:\n> ")
    path = '/home/holy/thinker/leads/'
    return [os.path.join(path, x) for x in os.listdir(path)]
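To see the difference for yourself, here is a small self-contained check. It uses a temporary directory as a stand-in for the leads folder and a randomly named file, both assumptions for illustration only; `demo` is a hypothetical helper name. Written for Python 3.

```python
import os
import tempfile


def demo():
    """Show that a name from os.listdir only resolves once joined with its directory."""
    folder = tempfile.mkdtemp()                        # stand-in for the leads folder
    fd, _ = tempfile.mkstemp(suffix='.txt', dir=folder)
    os.close(fd)                                       # we only need the file to exist

    name = os.listdir(folder)[0]                       # bare file name, no directory part
    bare_exists = os.path.exists(name)                 # resolved against os.getcwd()
    full_exists = os.path.exists(os.path.join(folder, name))
    return name, bare_exists, full_exists
```

`demo()` returns the bare name together with two flags. The first flag should be false, because a relative name is looked up in the current working directory, while the second should be true once the folder is prepended.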