Deleting lines from a text file

2 votes

2 answers

4364 views

Asked 2025-04-15 21:30

I have a text file whose contents look roughly like this:

> 1 -4.6    -4.6    -7.6
> 
> 2 -1.7    -3.8    -3.1
> 
> 3 -1.6    -1.6    -3.1

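The data is tab-separated, and the first column gives the position.

I need to go through every value in the text file except those in the first column and find the lowest value.

Once the lowest value has been found, it needs to be written to a new text file along with its column name and position. The first column is named "position", the second "fifteen", the third "sixteen", and the fourth "seventeen".

For example, the lowest value in the data above is "-7.6", and it is in the third data column (the fourth column overall), "seventeen". So I need to write "7.6", "seventeen", and its position value (1 in this case) to the new text file.

Next, I also need to delete a number of lines from the text file above.

For example, the lowest value "-7.6" above was found at position "1", in the column "seventeen". So I need to delete 17 lines starting from position 1, inclusive.

In other words, the column that holds the lowest value determines how many lines to delete, and its position marks where the deletion starts.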

2 Answers

Here is my understanding of what you want (although your requirements are a little hard to pin down):

def extract_bio_data(input_path, output_path):
    # map column indexes (after popping the row number) to the number of rows to skip
    col_index = {0: 15,
                 1: 16,
                 2: 17}

    skip_to_position = 0

    # 'with' closes both files automatically, even if an error occurs
    with open(input_path, 'r') as input_file, open(output_path, 'w') as output_file:
        # write the output file's headers
        output_file.write('\t'.join(('position', 'min_value', 'rows_skipped')) + '\n')

        for line in input_file:
            # strip surrounding whitespace, then drop a leading '>' marker if one is present
            line = line.strip()
            if line.startswith('>'):
                line = line[1:].strip()

            # if the line contains no data, skip it
            if line == '':
                continue

            # split the columns on whitespace (change this to split('\t') for splitting only on tabs)
            columns = line.split()

            # extract the row number/position of this data
            position = int(columns.pop(0))

            # this is where we skip rows/positions
            if position < skip_to_position:
                continue

            # if two columns share the minimum value, this will be the first encountered in the list
            min_index = columns.index(min(columns, key=float))

            # this is an integer version of the 'column name' which corresponds to the number of rows that need to be skipped
            rows_to_skip = col_index[min_index]

            # write data to your new file (row number, minimum value, number of rows skipped)
            output_file.write('\t'.join(str(x) for x in (position, columns[min_index], rows_to_skip)) + '\n')

            # set the number of data rows to skip from this position
            skip_to_position = position + rows_to_skip


if __name__ == '__main__':
    in_path = r'c:\temp\test_input.txt'
    out_path = r'c:\temp\test_output.txt'
    extract_bio_data(in_path, out_path)
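The reason for passing key=float to min can be seen in a small sketch that uses the first row of the sample data. The variable names here (string_min, numeric_min, column_names, min_name) are illustrative only and are not part of the function above.

```python
# Values from the first sample row, still strings after split()
columns = ['-4.6', '-4.6', '-7.6']

# Plain min() compares strings character by character, so '-4.6' sorts before '-7.6'
string_min = min(columns)

# key=float compares the numeric values and finds the true minimum
numeric_min = min(columns, key=float)

# index() returns the first matching column, which maps onto the column name
column_names = ['fifteen', 'sixteen', 'seventeen']
min_name = column_names[columns.index(numeric_min)]

print(string_min, numeric_min, min_name)
```

Without key=float the comparison is lexicographic, which gives the wrong answer for negative numbers like these.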

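Things I'm not sure about:

Is there really a "> " at the start of each line, or is that a copy-and-paste artifact?
- I assumed it is not an artifact.
Do you want "7.6" or "-7.6" written to the new file?
- I assumed you want the original value.
Do you want to skip lines in the file, or skip positions based on the contents of the first column?
- I assumed you want to skip positions.
You said you want to delete data from the original file.
- I assumed that skipping positions is sufficient.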

Answered 2025-04-15 by Python大师

Open this file for reading and another file for writing, then copy across every line that does not match the filter condition:

# somepredicate is a placeholder: define it to return True for lines that should be dropped
with open('somefile', 'r') as readfile, open('otherfile', 'w') as writefile:
    for line in readfile:
        if not somepredicate(line):
            writefile.write(line)
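As one possible way to fill in the predicate for the question's case, the sketch below drops every line whose position (first column) falls inside a given range. The helpers make_range_predicate and copy_without, the temporary file paths, and the assumption that the lines carry no "> " prefix are all illustrative and not part of the original answer.

```python
import os
import tempfile


def make_range_predicate(start, count):
    """Build a predicate that is True for lines whose position is in [start, start + count)."""
    def predicate(line):
        fields = line.split()
        if not fields:
            return False  # blank line: keep it
        try:
            position = int(fields[0])
        except ValueError:
            return False  # first field is not an integer position: keep the line
        return start <= position < start + count
    return predicate


def copy_without(src_path, dst_path, predicate):
    """Copy src to dst, leaving out every line for which predicate(line) is True."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            if not predicate(line):
                dst.write(line)


# Demo on throwaway files: five rows at positions 1..5, drop 2 rows starting at position 2
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'somefile')
dst_path = os.path.join(tmp_dir, 'otherfile')
with open(src_path, 'w') as f:
    for position in range(1, 6):
        f.write('%d\t-1.0\t-2.0\t-3.0\n' % position)

copy_without(src_path, dst_path, make_range_predicate(start=2, count=2))

with open(dst_path) as f:
    kept_positions = [int(line.split()[0]) for line in f]
print(kept_positions)
```

For the example in the question, the call would presumably be make_range_predicate(start=1, count=17). Writing to a second file and then replacing the original is generally safer than editing the input in place.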

Answered 2025-04-15 by Python大师
