Python diff and get the new p

2 answers

User

#1 · edited 2024-05-16 00:26:02

If duplicates and ordering don't matter, this is simple:

with open('firstFile') as f:
    first = set(f)
with open('secondFile') as f:
    second = set(f)

# Lines found in secondFile but not in firstFile (unordered, de-duplicated)
diff = second - first

If the order of the output matters:

with open('firstFile') as f:
    first = f.readlines()
with open('secondFile') as f:
    second = f.readlines()

# Keeps secondFile's order; `in` on a list rescans it for every line
diff = [line for line in second if line not in first]
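The list version above does a linear scan of `first` for every line of the second file. A minimal sketch of the same idea with a set for the membership test, assuming the first file fits in memory; the function name `new_lines_in_order` and its path parameters are made up for illustration:

```python
def new_lines_in_order(first_path, second_path):
    """Return lines of second_path that never occur in first_path.

    Output keeps the order (and any duplicates) of the second file.
    """
    with open(first_path) as f:
        seen = set(f)  # one entry per distinct line, newline included
    with open(second_path) as f:
        return [line for line in f if line not in seen]
```

Lines are compared with their trailing newline, so a final line without one will not match the same text followed by a newline. Strip the lines first if that matters for your files.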

If the order of the input matters, then the question needs clarifying.
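One possible reading of "input order matters" is a positional diff, where a line counts as new if it was inserted at some position, even when the same text occurs elsewhere in the first file. The question does not confirm this reading, so treat the following as a sketch under that assumption. It uses the standard library's `difflib.SequenceMatcher`; the helper name `added_lines` is hypothetical:

```python
import difflib

def added_lines(first_lines, second_lines):
    """Return the lines difflib reports as inserted or replaced in second_lines."""
    matcher = difflib.SequenceMatcher(a=first_lines, b=second_lines, autojunk=False)
    added = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # 'insert' and 'replace' opcodes cover content present only in b
        if tag in ("insert", "replace"):
            added.extend(second_lines[j1:j2])
    return added
```

Unlike the set difference, this would report a line appended to the end of the second file even if an identical line already exists earlier in the first file. `SequenceMatcher` holds both sequences in memory, so it only suits files that fit there.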

If the files are large enough that loading them into memory is a bad idea, you may have to do something like this:

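# Stream secondFile and rescan firstFile for each of its lines:
# slow (one full pass over firstFile per line) but uses almost no memory
with open('secondFile') as secondFile, open('diffFile', 'w') as diffFile:
    for secondLine in secondFile:
        match = False
        with open('firstFile') as firstFile:
            for firstLine in firstFile:
                if firstLine == secondLine:
                    match = True
                    break
        if not match:
            # secondLine already ends with its own newline
            diffFile.write(secondLine)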

User

#2 · edited 2024-05-16 00:26:02

Going by the comments on the question, this would do it:

with open("tmp1.txt") as f:
    first = set(x.strip() for x in f)
with open("tmp2.txt") as f:
    second = set(x.strip() for x in f)
print(second - first)

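However, if we take "large" seriously, loading the entire files before processing may need more memory than the machine has. If the first file is small enough to fit in memory but the second one is not, you can do this: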

with open("tmp1.txt") as f:
    first = set(x.strip() for x in f)
# Iterating over the file object reads one line at a time
with open("tmp2.txt") as f:
    for line in f:
        line = line.strip()
        if line not in first:
            print(line)

If the first file is too large as well, I think you'd need to resort to a database.
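The answer does not say which database to use. As a rough sketch of what that could look like, here is one way to do it with the standard library's `sqlite3` module, assuming an on-disk SQLite file is acceptable. The function name `diff_with_sqlite`, the table name `first_lines`, and the `db_path` parameter are all invented for this example:

```python
import sqlite3

def diff_with_sqlite(first_path, second_path, db_path=":memory:"):
    """Yield stripped lines of second_path that do not occur in first_path.

    db_path should name a fresh database. The ":memory:" default is only
    for small experiments; pass a path on disk when the first file is too
    large for RAM.
    """
    conn = sqlite3.connect(db_path)
    try:
        # The PRIMARY KEY gives SQLite an index to look lines up by
        conn.execute("CREATE TABLE first_lines (line TEXT PRIMARY KEY)")
        with open(first_path) as f:
            conn.executemany(
                "INSERT OR IGNORE INTO first_lines (line) VALUES (?)",
                ((line.strip(),) for line in f),
            )
        conn.commit()
        with open(second_path) as f:
            for line in f:
                line = line.strip()
                row = conn.execute(
                    "SELECT 1 FROM first_lines WHERE line = ?", (line,)
                ).fetchone()
                if row is None:
                    yield line
    finally:
        conn.close()
```

Both files are streamed line by line, and the lookup cost moves to SQLite's index on disk. Lines are stripped to match the snippets above.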
