Python: Extracting lines from a file using another file as keys

2 votes
2 answers
970 views
Asked 2025-04-18 01:01

I have a "key" file that looks like this (MyKeyFile):

afdasdfa ghjdfghd wrtwertwt asdf (these are arranged in a column, but I never figured out the formatting, sorry)

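I call these keys. Each one is identical to the first word of a line that I want to extract from a "source" file. The source file (MySourceFile) looks roughly like this (the formatting is bad here too, but the first column is the key and the following columns are data):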

afdasdfa (several tab-separated columns) . . ghjdfghd (several tab-separated columns) . wrtwertwt . . asdf

Here "." stands for a line that is currently of no interest.

I am new to Python, and this is how far I have got:

with open('MyKeyFile','r') as infile, \
open('MyOutFile','w') as outfile:
    for line in infile:
        for runner in source:
            # pick up the first word of the line in source
            # if match, print the entire line to MyOutFile
            # here I need help
outfile.close()

I realize there may be better ways to do this. Any feedback is welcome, whether on my approach to the problem or on more sophisticated methods.

Thanks, jd

2 Answers

1

I think this way is a bit simpler, assuming your "key" file is called "key_file.txt" and your main file is called "main_file.txt".

keys = []
# "r" opens a file for reading, "w" opens it for writing.
with open("key_file.txt", "r") as key_file:
    for line in key_file:
        key = line.strip()  # drop the trailing newline, otherwise the key rarely matches
        if key:             # ignore blank lines
            keys.append(key)
# keys is now a list of strings.
# Take each line of the main file and check whether it contains any of the keys.

with open("main_file.txt", "r") as main_file:
    for line in main_file:
        for key in keys:
            if key in line:
                print("I FOUND A LINE THAT CONTAINS THE TEXT OF SOME KEY", line.rstrip("\n"))
                break  # report each line once, even if several keys match

You can change the print call, or replace it entirely, to do whatever you need with each line that contains the text of a key. Let me know if this works.
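A substring check matches a key anywhere in a line, including inside a longer word, whereas the question asks for lines whose first word is a key. Below is a minimal sketch of that stricter comparison. The helper name first_word_in_keys and the sample lines are made up for illustration; only the keys come from the question.

```python
def first_word_in_keys(line, keys):
    """Return True if the first whitespace-separated word of line is in keys."""
    words = line.split()          # split() with no argument handles tabs and spaces
    return bool(words) and words[0] in keys


keys = {"afdasdfa", "asdf"}       # sample keys from the question

print(first_word_in_keys("asdf\t1\t2\n", keys))    # key is the first word
print(first_word_in_keys("asdfgh\t1\t2\n", keys))  # key is only a prefix of the first word
print(first_word_in_keys("xyz asdf\n", keys))      # key appears later in the line
```

Only the first call reports a match. Storing the keys in a set rather than a list keeps each lookup cheap when the key file is long.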

0

As I understand it (please correct me in the comments if I am wrong), you have three files:

  1. MySourceFile (the source file)
  2. MyKeyFile (the key file)
  3. MyOutFile (the output file)

What you want to do is:

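  1. Read the keys from MyKeyFile
  2. Read the contents of MySourceFile
  3. Go through the source file line by line
  4. If the first word of a line is one of the keys, append that line to MyOutFile
  5. Close MyOutFile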

Here is the code:

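with open('MySourceFile', 'r') as sourcefile:
    source = sourcefile.read().splitlines()

# split() with no argument breaks on any whitespace, so keys may be
# separated by spaces, tabs, or newlines. A set gives fast lookups.
with open('MyKeyFile', 'r') as keyfile:
    keys = set(keyfile.read().split())

with open('MyOutFile', 'w') as outfile:
    for line in source:
        words = line.split()
        # Skip blank lines, then compare the first word against the keys.
        if words and words[0] in keys:
            outfile.write(line + "\n")
# No explicit close() is needed: the with block closes the file on exit.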
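If the source file is large, the same steps can be carried out without loading the whole file into memory. The following is only a sketch under that assumption: the function name extract_lines and its return value (the number of lines written) are illustrative choices, not part of the question.

```python
def extract_lines(key_path, source_path, out_path):
    """Copy the lines of source_path whose first word is a key listed in key_path.

    Returns the number of lines written to out_path.
    """
    with open(key_path) as keyfile:
        keys = set(keyfile.read().split())   # whitespace-separated keys

    written = 0
    with open(source_path) as source, open(out_path, "w") as out:
        for line in source:                  # reads one line at a time
            words = line.split()
            if words and words[0] in keys:
                # Make sure every copied line ends with a newline,
                # even if it was the last line of the source file.
                out.write(line if line.endswith("\n") else line + "\n")
                written += 1
    return written


# Example call, using the file names from the question:
# extract_lines("MyKeyFile", "MySourceFile", "MyOutFile")
```

Iterating over the file object directly keeps memory use roughly constant, and matching lines are written in the order they appear in the source file.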
