Reading from multiple files and storing the data in a list

1 vote
3 answers
1660 views
Asked 2025-04-17 22:39

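I want to find all the files in a folder and store the contents of each file in a list for later use.

My problem: when I add print statements to check that the files exist, the output shows the current file, i.e. the first file in the list. But when I then try to read that file, I get an error saying the file cannot be found.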

import re
import os
# Program to extract emails from text files


def path_file():
    #path = raw_input("Please enter path to file:\n> ")
    path = '/home/holy/thinker/leads/'
    return os.listdir('/home/holy/thinker/leads') # returns a list like ["file1.txt", 'image.gif'] # need to remove trailing slashes

# read a file as 1 big string
def in_file():

    print path_file()
    content = []
    for a_file in path_file(): # ['add.txt', 'email.txt']
        print a_file
        fin = open(a_file, 'r') 
        content.append(fin.read()) # store content of each file
        print content
        fin.close()
    return content


print in_file()

# this is the error i get
""" ['add.txt', 'email.txt']
add.txt
Traceback (most recent call last):
  File "Extractor.py", line 24, in <module>
    print in_file()
  File "Extractor.py", line 17, in in_file
    fin = open(a_file, 'r') 
IOError: [Errno 2] No such file or directory: 'add.txt'
"""

The traceback above is the error I get.

3 Answers

0

Here is a rewritten version that uses glob to restrict which files are considered:

import glob
import os
import re
import sys

if sys.hexversion < 0x3000000:
    # Python 2.x
    inp = raw_input
else:
    # Python 3.x
    inp = input

def get_dir(prompt):
    while True:
        dir_name = inp(prompt)
        dir_name = os.path.join(os.getcwd(), dir_name)
        if os.path.isdir(dir_name):
            return dir_name
        else:
            print("{} does not exist or is not a directory".format(dir_name))

def files_in_dir(dir_name, file_spec="*.txt"):
    return glob.glob(os.path.join(dir_name, file_spec))

def file_iter(files):
    for fname in files:
        with open(fname) as inf:
            yield fname, inf.read()

def main():
    email_dir   = get_dir("Please enter email directory: ")
    email_files = files_in_dir(email_dir, "*.eml")

    print(email_files)

    content = [txt for fname,txt in file_iter(email_files)]
    print(content)

if __name__=="__main__":
    main()

A trial run looks like this:

Please enter email directory: c:\temp
['c:\\temp\\file1.eml', 'c:\\temp\\file2.eml']
['file1 line one\nfile1 line two\nfile1 line three',
 'file2 line one\nfile2 line two']
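
For comparison, the same idea can be sketched with pathlib. This is an alternative to the code above, not part of it, and it assumes Python 3 only. The name read_matching is made up for illustration.

```python
from pathlib import Path


def read_matching(directory, pattern="*.eml"):
    """Return a dict mapping file name to text for files matching `pattern`."""
    result = {}
    # Path.glob yields full paths, so each file can be opened directly.
    for path in sorted(Path(directory).glob(pattern)):
        if path.is_file():  # skip directories whose names happen to match
            result[path.name] = path.read_text()
    return result
```

Because the glob results already include the directory, the "No such file" problem from the question does not arise here.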
0

You should use the full path of the file you want to read.

So do this: fin = open(os.path.join(r'/home/holy/thinker/leads/', a_file), 'r')
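
Put together, the whole read loop might look like the sketch below. The function name read_folder is hypothetical, and skipping subdirectories with os.path.isfile is an added assumption, since os.listdir also returns directory names.

```python
import os


def read_folder(folder):
    """Return the contents of every regular file directly inside `folder`."""
    contents = []
    for name in sorted(os.listdir(folder)):
        # os.listdir returns bare names, so join each one with the folder.
        full_path = os.path.join(folder, name)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        with open(full_path, "r") as handle:
            contents.append(handle.read())
    return contents
```

Called as read_folder('/home/holy/thinker/leads'), this should work regardless of which directory the script is started from.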

1

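os.listdir only returns the file names. You need to put the directory name in front of each file name.

As written, the program tries to open add.txt in the folder you run it from (the current working directory). Prepend the folder name to the file name.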

def path_file():
    #path = raw_input("Please enter path to file:\n> ")
    path = '/home/holy/thinker/leads/'
    return [os.path.join(path, x) for x in os.listdir(path)]
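
To see where a bare name is actually looked up, here is a small sketch. The helper names lookup_location and intended_location are made up for illustration.

```python
import os


def lookup_location(name):
    """Where open(name) looks for a bare file name: the working directory."""
    return os.path.abspath(name)


def intended_location(folder, name):
    """Where the file really lives once the folder is prepended."""
    return os.path.join(folder, name)


if __name__ == "__main__":
    # Unless the script is run from inside the leads folder, these differ.
    print(lookup_location("add.txt"))
    print(intended_location("/home/holy/thinker/leads", "add.txt"))
```

The first path depends on where the script is started; the second does not, which is why joining with the folder fixes the IOError.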
