Compare two files by checking the first 3 columns; if the values differ, print the whole line (Python)

Posted 2024-05-14 06:26:17


I am new to Python and Stack Overflow. Please forgive me if I have not explained my problem properly.

First file (test1.txt): 

customer    ID    age country  version

 - Alex     #1233  25  Canada     7 
 - James    #1512  30  USA        2 
 - Hassan   #0051  19  USA        9



Second file (test2.txt): 

customer     ID    age country  version

 - Alex     #1233  25  Canada    3 
 - James    #1512  30  USA       7 
 - Bob      #0061  20  USA       2 
 - Hassan   #0051  19  USA       1

Results for the missing lines should be

Bob #0061 20 USA  2

Here is the code:

missing = []
with open('C:\\Users\\yousi\\Desktop\\Work\\Python Project\\test1.txt.txt', 'r') as a_file:
    a_lines = a_file.read().split('\n')

with open('C:\\Users\\yousi\\Desktop\\Work\\Python Project\\test2.txt.txt', 'r') as b_file:
    b_lines = b_file.read().split('\n')

for line_a in a_lines:
    for line_b in b_lines:
        # substring test on the whole line
        if line_a in line_b:
            break
    else:
        # no line in b_lines contains line_a
        missing.append(line_a)

print(missing)
# the with blocks have already closed both files, so no close() calls are needed

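The problem with this code is that it compares the two files based on whole lines. I only want to check the first 3 columns and, if they do not match, print the whole line.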

New example:

First file (test1.txt)

60122 LX HNN --   4  32.7390  -114.6357     40 Winterlaven - Sheriff Sabstation
60122 LX HNZ --   4  32.7390  -114.6357     40 Winterlaven - Sheriff Sabstation
60122 LX HNE --   4  32.7390  -114.6357     40 Winterlaven - Sheriff Sabstation


Second file (test2.txt)

60122 LX HNN --   4  32.739000   -114.635700   40   Winterlaven - Sheriff Sabstation        
60122 LX HNZ --   4  32.739000   -114.635700   40   Winterlaven - Sheriff Sabstation        
60122 LX HNE --   4  32.739000   -114.635700   40   Winterlaven - Sheriff Sabstation 

2 answers

If test1.txt and test2.txt contain the text from the question, then this script:

with open('test1.txt', 'r') as f1, open('test2.txt', 'r') as f2:
    i1 = [line.split()[:-1] for line in f1 if line.strip().startswith('-')]
    i2 = (line.split() for line in f2 if line.strip().startswith('-'))
    missing = [line for line in i2 if line[:-1] not in i1]

for _, *line in missing:
    print(' '.join(line))

prints:

Bob #0061 20 USA 2

Edit: If the lines in the files do not start with -, then this script:

with open('test1.txt', 'r') as f1, open('test2.txt', 'r') as f2:
    i1 = [line.split()[:-1] for line in f1 if line.strip()]
    i2 = (line.split() for line in f2 if line.strip())
    missing = [line for line in i2 if line[:-1] not in i1]

for line in missing:
    print(' '.join(line))

prints:

Bob #0061 20 USA 2

Edit 2: To compare only the first 3 columns, you can use this example (note the [:3]):

with open('file1.txt', 'r') as f1, open('file2.txt', 'r') as f2:
    i1 = [line.split()[:3] for line in f1 if line.strip()]
    i2 = (line.split() for line in f2 if line.strip())
    missing = [line for line in i2 if line[:3] not in i1]

for line in missing:
    print(' '.join(line))

For the new example files in the question, this prints nothing.
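
The same idea can be wrapped in a small function. This is only a sketch, not the answerer's code: the function name `missing_lines` and the parameter `n_cols` are made up for illustration, and the sample data is inlined as lists instead of being read from test1.txt and test2.txt. It stores the keys as tuples in a set, so each membership test does not have to scan a list.

```python
def missing_lines(lines_a, lines_b, n_cols=3):
    """Return the lines of lines_b whose first n_cols columns do not occur in lines_a."""
    # Tuples are hashable (lists are not), so the keys can live in a set.
    keys_a = {tuple(line.split()[:n_cols]) for line in lines_a if line.strip()}
    return [
        line.rstrip("\n")
        for line in lines_b
        if line.strip() and tuple(line.split()[:n_cols]) not in keys_a
    ]


# Sample data from the question, without the leading "-" markers.
test1 = [
    "customer    ID    age country  version",
    "Alex     #1233  25  Canada     7",
    "James    #1512  30  USA        2",
    "Hassan   #0051  19  USA        9",
]
test2 = [
    "customer     ID    age country  version",
    "Alex     #1233  25  Canada    3",
    "James    #1512  30  USA       7",
    "Bob      #0061  20  USA       2",
    "Hassan   #0051  19  USA       1",
]

for line in missing_lines(test1, test2):
    # Collapse runs of spaces before printing.
    print(" ".join(line.split()))
```

With real files, the two lists could be replaced by the open file objects, since iterating over a file also yields one line at a time.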

If you want to compare the first 3 columns, you should do it like this:

a_line = 'Alex 1233 25 Canada'  # this is one line from a file

# split the line on whitespace
a_line = a_line.split()
print(a_line)
# ['Alex', '1233', '25', 'Canada']

# cut out the first 3 columns (slicing a list gives a list)
a_line = a_line[:3]
print(a_line)
# ['Alex', '1233', '25']

# then you can compare; lists are compared element by element
print(['Alex', '1233', '25', 'Canada'] == ['Alex', '1233', '25', 'Canada'])
# True

print(['Alex', '1233', '25', 'Canada'] == ['Alex', '1233', '25', 'Canada2'])
# False

Instead of read().split('\n'), you can just use readlines().
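
The two calls are not quite interchangeable, which the sketch below tries to show. It uses `io.StringIO` as a stand-in for an open file (an assumption made so the example runs without files on disk), and the sample text is made up. `readlines()` keeps the trailing newline on each line, while `read().split('\n')` drops the newlines but leaves an empty string at the end when the text ends with a newline.

```python
import io

text = "Alex 1233 25 Canada\nJames 1512 30 USA\n"

# read().split('\n') removes the newlines but yields a trailing empty string
split_lines = io.StringIO(text).read().split("\n")

# readlines() keeps the '\n' at the end of every line
raw_lines = io.StringIO(text).readlines()

# str.split() with no argument ignores leading and trailing whitespace,
# so the first three columns come out the same for both variants
keys = [line.split()[:3] for line in raw_lines]

print(split_lines)
print(raw_lines)
print(keys)
```

For a column-based comparison the difference does not matter, because `split()` discards the newline anyway. It does matter for whole-line comparisons such as `line_a in line_b`.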
