How can I read and write files automatically?

Posted 2024-04-20 03:28:14


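I want to read a large number of data files in a folder, delete every line that contains "DT=(SINGLE SINGLE SINGLE)", and write the result out as new data. The data folder contains 300 data files!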

My code is:

import os, sys
path = "/Users/xxx/Data/"

allFiles = os.listdir(path)

for fname in allFiles:
    print(fname)

    with open(fname, "r") as f:
        with open(fname, "w") as w:
            for line in f:
                if "DT=(SINGLE SINGLE SINGLE)" not in line:
                    w.write(line)
FileNotFoundError: [Errno 2] No such file or directory: '1147.dat'

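I want to do this for a whole batch of files. How can I automatically read each file, delete the lines, and write the result? Is there a way to create a new data file under a different name, e.g. 1147.dat -> 1147_new.dat?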


1 answer
User
#1 · Posted 2024-04-20 03:28:14

The following should do it; the notes after the code explain what each annotated line does:

import os

path = "/Users/xxx/Data/"
allFiles = [os.path.join(path, filename) for filename in os.listdir(path)] # [1]
del_keystring = "DT=(SINGLE SINGLE SINGLE)" # general case

for filepath in allFiles: # better longer var names for clarity
    print(filepath)

    with open(filepath,'r') as f_read: # [2]
        loaded_txt = f_read.readlines() # list of lines, each keeping its trailing '\n'
    new_txt = []
    for line in loaded_txt:
        if del_keystring not in line: # keep only lines without the key string
            new_txt.append(line)
    with open(filepath,'w') as f_write: # [2]
        f_write.write(''.join(new_txt)) # [4]

    with open(filepath,'r') as f_read: # [5]
        assert len(f_read.readlines()) <= len(loaded_txt)
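  • [1] os.listdir returns only file names, not file paths; os.path.join joins its inputs into a full path using the platform's separator (for example, on Windows: folderpath + '\\' + filename)
  • [2] This is not the same as nesting with open(X,'r') as .., with open(X,'w') as ..: opening with 'w' empties the file, so the 'r' handle has nothing left to read
  • [3] If f_read.read() == "Abc\nDe\n12", then f_read.read().split('\n') == ["Abc","De","12"] (an alternative to readlines(), not used in the code above)
  • [4] Undoes [3]: if _ls == ["a","bc","12"], then "\n".join([x for x in _ls]) == "a\nbc\n12"; the code above joins with '' instead because readlines() keeps each line's trailing newline
  • [5] Optional check that the saved file has no more lines than the original
  • Note: the saved files may come out slightly larger than the originals, possibly because the originals were packed or compressed more efficiently (the documentation of whatever produced them may say); [5] makes sure the difference is not due to extra lines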
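The question also asks about writing the result under a different name (1147.dat -> 1147_new.dat) instead of overwriting. A minimal sketch of that variant follows; the helper names filter_file, new_name and filter_folder, and the "_new" suffix handling, are illustrative assumptions rather than part of the original answer. Reading and writing two different paths at once is safe here, unlike the same-file case in [2].

```python
import os

def filter_file(src_path, dst_path, del_keystring):
    """Copy src_path to dst_path, skipping lines that contain del_keystring.

    Returns the number of lines that were dropped."""
    dropped = 0
    with open(src_path, "r") as f_read, open(dst_path, "w") as f_write:
        for line in f_read:
            if del_keystring in line:
                dropped += 1
            else:
                f_write.write(line)
    return dropped

def new_name(filepath, suffix="_new"):
    """Insert suffix before the extension: 1147.dat becomes 1147_new.dat."""
    root, ext = os.path.splitext(filepath)
    return root + suffix + ext

def filter_folder(folder, del_keystring, suffix="_new"):
    """Filter every regular file in folder; return {output path: lines dropped}."""
    results = {}
    for filename in sorted(os.listdir(folder)):
        src = os.path.join(folder, filename)
        root, _ = os.path.splitext(filename)
        # Skip subdirectories and files produced by an earlier run.
        if not os.path.isfile(src) or root.endswith(suffix):
            continue
        dst = new_name(src, suffix)
        results[dst] = filter_file(src, dst, del_keystring)
    return results
```

Calling filter_folder("/Users/xxx/Data/", "DT=(SINGLE SINGLE SINGLE)") would leave the originals untouched and write one filtered copy next to each of them.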

# bonus code to explicitly verify intended lines were deleted
with open(original_file_path,'r') as txt:
    print(''.join(txt.readlines()[:80])) # select small excerpt
with open(processed_file_path,'r') as txt:
    print(''.join(txt.readlines()[:80])) # select small excerpt
# ''.join() since .readlines() returns a list, delimited by \n
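Eyeballing an excerpt works for one file; for 300 files a programmatic check may be easier. The sketch below assumes the original file is still available (for example as a backup copy or because the output was written under another name); count_lines and verify_deletion are hypothetical helper names, not part of the original answer.

```python
def count_lines(filepath, keystring=None):
    """Count lines in filepath; if keystring is given, count only lines containing it."""
    with open(filepath, "r") as txt:
        return sum(1 for line in txt if keystring is None or keystring in line)

def verify_deletion(original_file_path, processed_file_path, keystring):
    """Return True if the processed file lost exactly the lines containing keystring."""
    removed_expected = count_lines(original_file_path, keystring)
    leftover = count_lines(processed_file_path, keystring)
    line_diff = count_lines(original_file_path) - count_lines(processed_file_path)
    # No matching line may remain, and nothing else may have been dropped.
    return leftover == 0 and line_diff == removed_expected
```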


Note: for more advanced caveats, see the comments below; for a more compact alternative, see Torxed's version
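A compact in-place variant can also be sketched with the standard library's fileinput module. This is an illustrative assumption, not the version referenced above; delete_lines_inplace is a hypothetical name. With inplace=True, anything printed inside the loop is written back to the file currently being read.

```python
import fileinput

def delete_lines_inplace(filepaths, del_keystring):
    """Rewrite each file in filepaths without the lines containing del_keystring."""
    filepaths = list(filepaths)
    if not filepaths:
        # fileinput falls back to sys.argv / stdin when given no files, so stop here.
        return
    with fileinput.input(files=filepaths, inplace=True) as lines:
        for line in lines:
            if del_keystring not in line:
                # stdout is redirected to the current file while inplace=True.
                print(line, end="")
```

Passing backup=".bak" to fileinput.input would keep a copy of each original alongside the rewritten file.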
