Search a file for a string and copy all following lines until string 2

1 vote
1 answer
2371 views
Asked 2025-04-17 14:12

I'm writing a script in Python 3 and have run into a problem I can't solve.

I have a list of names in this format:

ZINC123456
ZINC234567
ZINC345678
ZINC456789
...

I also have a very large file that looks like this:

ZINC123456
xxx
xxx
xxx
ZINC987654
xxy
xxy
xxy
xxy
ZINC654987
...

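What I want to do: for each item in the first list, search for it in the second file. When the item is found, copy that line and everything after it to a new file, stopping at the next line in the ZINCxxxxxx format.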

How can I do this? Thank you very much for your help!

Edit: Thanks to Sudipta Chatterjee, I found the following solution:

import sys

finZ = open(sys.argv[1], 'r')
finX = open('zinc.sdf', 'r')
fout = open(sys.argv[1][:7] + '.sdf', 'w')

names = []  # renamed from `list` to avoid shadowing the built-in
thislinehaszinc = False
zincmatching = False

# Collect the unique ZINC names from the first file
for zline in finZ:
    if zline[0:4] == "ZINC":
        name = zline.rstrip('\n')
        if name not in names:
            names.append(name)

# Copy every block of the big file whose ZINC header is in the list
for xline in finX:
    if xline[0:4] == "ZINC":
        thislinehaszinc = True
        zincmatching = False
        for line in names:
            if line == xline.rstrip('\n'):
                zincmatching = True
                fout.write(xline)
                print('Found: ' + xline)
                break
    else:
        thislinehaszinc = False

    # Lines below a matching header belong to that block
    if not thislinehaszinc and zincmatching:
        fout.write(xline)

finZ.close()
finX.close()
fout.close()

1 Answer

-1
# Clarified from comments: the program acts as a filter. A 'ZINC' line in the
# second file that is not listed in the first file stops the dump until the
# next matching ZINC line is found.

with open('file_with_zinc_only.txt', 'r') as f:
    # Strip newlines so a missing final newline cannot break the comparison
    fileZ = [line.rstrip('\n') for line in f]
with open('file_with_x_info.txt', 'r') as f:
    fileX = f.readlines()
fileOutput = open('file_for_output.txt', 'w')

thisLineHasZinc = False
zincMatching = False

for xline in fileX:
    if 'ZINC' in xline:
        thisLineHasZinc = True
        # A ZINC line starts a new block; dump it only if it is listed
        zincMatching = xline.rstrip('\n') in fileZ
        if zincMatching:
            fileOutput.write(xline)
    else:
        thisLineHasZinc = False

    # If we are inside a block whose ZINC header matched and have not yet
    # reached another ZINC line, write the line to the output file
    if not thisLineHasZinc and zincMatching:
        fileOutput.write(xline)

fileOutput.close()
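
Since the second file is described as very large, a variant that streams it line by line and keeps the wanted names in a set may be worth considering. The following is only a minimal sketch under stated assumptions: the function name `filter_blocks`, its parameters, and the `HEADER` pattern are hypothetical, and it assumes a header line consists solely of `ZINC` followed by digits, which is stricter than the prefix check used above.

```python
import re

# Assumption: a header line is "ZINC" followed only by digits, e.g. ZINC123456
HEADER = re.compile(r"^ZINC\d+$")


def filter_blocks(names_path, data_path, out_path):
    """Copy every block of data_path whose header is listed in names_path.

    Returns the number of blocks copied.
    """
    # A set gives constant-time membership tests for the wanted names
    with open(names_path) as names_file:
        wanted = {line.strip() for line in names_file if line.strip()}

    copied = 0
    copying = False
    # Iterate over the data file lazily instead of loading it into memory
    with open(data_path) as data_file, open(out_path, "w") as out_file:
        for line in data_file:
            stripped = line.strip()
            if HEADER.match(stripped):
                # A new block starts: copy it only if its name is wanted
                copying = stripped in wanted
                if copying:
                    copied += 1
            if copying:
                out_file.write(line)
    return copied
```

It could be called as `filter_blocks('names.txt', 'zinc.sdf', 'out.sdf')`, where `names.txt` and `out.sdf` are placeholder file names. A single `copying` flag replaces the two booleans because a header line both ends the previous block and decides whether the next one is written.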
