Python: Extracting lines from a file using another file as keys
I have a "key" file (MyKeyFile) that looks like this, with one key per line:
afdasdfa
ghjdfghd
wrtwertwt
asdf
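I call these keys; each one is identical to the first word of a line I want to extract from a "source" file. The source file (MySourceFile) looks roughly like this (the first column is the key, the remaining columns are data):
afdasdfa (several tab-separated columns)
.
.
ghjdfghd (several tab-separated columns)
.
wrtwertwt
.
.
asdf
Here "." stands for a line that is currently of no interest.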
I am new to Python, and this is as far as I have got:
with open('MyKeyFile','r') as infile, \
        open('MyOutFile','w') as outfile:
    for line in infile:
        for runner in source:
            # pick up the first word of the line in source
            # if match, print the entire line to MyOutFile
            # here I need help
    outfile.close()
I realize there may be better ways to do this. Any feedback is welcome, whether on my approach to the problem or on more sophisticated methods.
Thanks, jd
2 Answers
1
I think this would be simpler, assuming your "key" file is called "key_file.txt" and your main file is called "main_file.txt".
keys = []
# "r" opens a file for reading, "w" opens it for writing.
with open("key_file.txt", "r") as my_file:
    for line in my_file:
        key = line.strip()  # drop the trailing newline, otherwise no key can match
        if key:
            keys.append(key)
# keys is now a list of strings.
# Take each line of the main file and check whether it contains any of the keys.
with open("main_file.txt", "r") as new_file:
    for line in new_file:
        for key in keys:
            if key in line:
                print("I FOUND A LINE THAT CONTAINS THE TEXT OF SOME KEY", line)
You can change the print call, or replace it entirely, to do whatever you need with each line that contains the text of some key. Let me know if this works.
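One caveat with testing whether a key appears anywhere in a line: with the sample keys from the question, asdf is a substring of afdasdfa, so a line belonging to one key can be reported for another. A minimal sketch of the stricter check, assuming the key is always the first whitespace-separated word of a line (the helper name first_word_matches is made up for illustration):

```python
def first_word_matches(line, keys):
    """Return True if the first whitespace-separated word of line is in keys."""
    words = line.split()
    # An empty or blank line has no first word, so it never matches.
    return bool(words) and words[0] in keys


keys = {"asdf"}
line = "afdasdfa\t1\t2\n"

# Substring test: "asdf" occurs inside "afdasdfa", so this is a false positive.
substring_hit = any(key in line for key in keys)
# First-word test: "afdasdfa" is not in keys, so the line is rejected.
first_word_hit = first_word_matches(line, keys)
print(substring_hit, first_word_hit)  # prints: True False
```

Which test is right depends on the data; if keys can legitimately appear in later columns, the substring version may be what you want.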
0
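As I understand it (correct me in the comments if I am wrong), you have three files:
- MySourceFile (the source file)
- MyKeyFile (the key file)
- MyOutFile (the output file)
What you want to do is:
- Read the keys from MyKeyFile
- Read the contents of MySourceFile
- Go through the source file line by line
- If the first word of a line is one of the keys, append that line to MyOutFile
- Close MyOutFile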
Here is the code:
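with open('MySourceFile', 'r') as sourcefile:
    source = sourcefile.read().splitlines()
with open('MyKeyFile', 'r') as keyfile:
    keys = set(keyfile.read().split())  # a set makes the membership test fast
with open('MyOutFile', 'w') as outfile:
    for line in source:
        words = line.split()
        # skip blank lines, then compare the first word against the keys
        if words and words[0] in keys:
            outfile.write(line + "\n")
# the with block closes MyOutFile automatically, so no explicit close() is needed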
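The code above reads the whole source file into memory before filtering it. If the source file is large, the same steps can be done one line at a time. This is only a sketch under the same assumptions (keys separated by whitespace, the key being the first word of each source line); the function name extract_lines and its return value are my own additions, not something from the question:

```python
def extract_lines(key_path, source_path, out_path):
    """Copy lines of source_path whose first word is a key; return how many were copied."""
    with open(key_path, "r") as keyfile:
        # A set gives constant-time lookups however many keys there are.
        keys = set(keyfile.read().split())

    written = 0
    with open(source_path, "r") as source, open(out_path, "w") as outfile:
        for line in source:  # iterating the file object reads one line at a time
            words = line.split()
            if words and words[0] in keys:
                outfile.write(line)  # line still carries its own newline
                written += 1
    return written
```

With the file names from the question, the call would be extract_lines("MyKeyFile", "MySourceFile", "MyOutFile").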