Reading two text files simultaneously in Python 3.0 and extracting the required strings

first_occurance = {}
with open("folder_1/file_1", "r") as file_1:
    with open("folder_1/file_2", "r") as file_2:
        for line_1, line_2 in zip(file_1, file_2):
            only_command = line_1.split()[0]
            if only_command in line_2:
                if only_command not in first_occurance:
                    print("\n " + only_command + " :\n")
                    print(" > " + line_1.strip())
                else:
                    print(" > " + line_1.strip())
                first_occurance[only_command] = only_command

1 Answer

User

#1 · Posted 2024-04-26 11:14:14

Here is what I think you may be trying to do:

from collections import defaultdict

data = """data1 data_1 1
data2 data_2 2
data1 data_3 2
data3 data_4 1
data2 data_3 1"""

commands = """data1
data2
data1
data3
data2"""

store = defaultdict(list)

for line, cmd in zip(data.split('\n'), commands.split('\n')):
    if line.startswith(cmd):
        store[cmd].append(line.strip())

for command in sorted(store):
    print("\n{}:".format(command))
    for l in store[command]:
        print("      >", l)

This produces the following output:

data1:
      > data1 data_1 1
      > data1 data_3 2

data2:
      > data2 data_2 2
      > data2 data_3 1

data3:
      > data3 data_4 1

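For each line in commands (the content read from file_2), if the line at the same position in data (file_1) starts with that same "command", the line is stored. By the way, you changed your data a lot, and I am not sure we understand what you want. It seems file_2 may even be useless, or perhaps you want to restructure your data?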

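Either way, once you have stored the grouped data, you can print the groups in sorted order (data1, data2, data3, ...). You have to store all the groups first; otherwise you would have to re-read the file once for every (data) group. That is also why your current output is not grouped: you print each line the moment you read it.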
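To make the difference concrete, here is a minimal sketch that contrasts the two orderings on the sample data from this answer. The helper names `commands_in_arrival_order` and `group_by_command` are made up for illustration and are not part of the original code.

```python
from collections import defaultdict

# Sample lines copied from the example data in this answer.
lines = [
    "data1 data_1 1",
    "data2 data_2 2",
    "data1 data_3 2",
    "data3 data_4 1",
    "data2 data_3 1",
]


def commands_in_arrival_order(lines):
    """Return the commands in the order you see them when printing on read."""
    return [line.split()[0] for line in lines]


def group_by_command(lines):
    """Collect every line under its command (first token) before printing."""
    store = defaultdict(list)
    for line in lines:
        store[line.split()[0]].append(line)
    return store


# Printing on arrival interleaves the commands.
print(commands_in_arrival_order(lines))

# Grouping first lets each command be printed exactly once, in sorted order.
grouped = group_by_command(lines)
for cmd in sorted(grouped):
    print(cmd, grouped[cmd])
```

The first print shows data1 and data2 appearing twice each, interleaved; the loop afterwards shows each command once with all of its lines together.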

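However, your file_2 data does not seem to be needed at all, at least judging by the desired output in your question. So here is a file-reading version that produces that output; note that it never reads file_2: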

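from collections import defaultdict

store = defaultdict(list)

with open("folder_1/file_1", "r") as data:
    for line in data:
        if not line.strip():
            continue  # skip blank lines, which carry no command
        # the command is the first whitespace-separated token on the line
        cmd = line.split(None, 1)[0]
        store[cmd].append(line.strip())

# print each group once, in sorted command order
for cmd in sorted(store):
    print("\n{}:".format(cmd))
    for line in store[cmd]:
        print("      >", line)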
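If you want to try this grouping without creating folder_1/file_1 on disk, one option is to feed the same loop an in-memory file. This is only a sketch: `group_lines` is a hypothetical helper name, the sample text is the example data from this answer with one blank line added, and skipping blank lines is my assumption about how such lines should be treated.

```python
import io
from collections import defaultdict


def group_lines(handle):
    """Group non-blank lines by their first whitespace-separated token."""
    store = defaultdict(list)
    for line in handle:
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines are ignored
        store[stripped.split(None, 1)[0]].append(stripped)
    return store


# io.StringIO behaves like an open text file, so no file on disk is needed.
sample = io.StringIO(
    "data1 data_1 1\n"
    "data2 data_2 2\n"
    "data1 data_3 2\n"
    "\n"
    "data3 data_4 1\n"
    "data2 data_3 1\n"
)

store = group_lines(sample)
for cmd in sorted(store):
    print("\n{}:".format(cmd))
    for line in store[cmd]:
        print("      >", line)
```

Because `group_lines` accepts any iterable of lines, the same function should also work with a real file: `with open("folder_1/file_1", "r") as data: store = group_lines(data)`.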
