Problem creating sublists of a list

-1 votes
2 answers
80 views
Asked 2025-04-14 17:01

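My task is to create combinations, something like a Cartesian product, from the attribute lines in a library file. The problem I am facing is how to group the lines that share the same attribute (the parameters next to it differ, of course) into sublists of a list. Keep in mind that my input may contain a thousand attribute lines, which have to be extracted from a library file.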

######################

Sample input:

attr1 apple 1                                                          
attr1 banana 2

attr2 grapes 1                                   
attr2 oranges 2

attr3 watermelon 0

######################

Expected output:

[['attr1 apple 1','attr1 banana 2'], ['attr2 grapes 1','attr2 oranges 2'], ['attr3 watermelon 0']]

What I get:

['attr1 apple 1','attr1 banana 2', 'attr2 grapes 1','attr2 oranges 2', 'attr3 watermelon 0']

Here is the code:

import re

# regex pattern definition
pattern = re.compile(r'attr\d+')

# Open the file for reading
with open(r"file path") as file:
    # Initialize an empty list to store matching lines
    matching_lines = []

    # reading each line 
    for line in file:
        # regex pattern match
        if pattern.search(line):
            # matching line append to the list
            matching_lines.append(line.strip())

# Grouping the  elements based on the regex pattern

#The required list
grouped_elements = []

#Temporary list for sublist grouping
current_group = []

for sentence in matching_lines:
    if pattern.search(sentence):
        current_group.append(sentence)
    else:
        if current_group:
            grouped_elements.append(current_group)
        current_group = [sentence]

if current_group:
    grouped_elements.append(current_group)

# Print the grouped elements
for group in grouped_elements:
    print(group)

2 Answers

0

If the file is already sorted, so that lines with the same attribute are adjacent, there is a simple solution:

from itertools import groupby

def read_data(filename):
    """Yields one line at a time, skipping empty lines"""
    with open(filename) as file:
        for line in file:
            line = line.strip()
            if not line:
                continue
            yield line      

def grouping_key(x):
    "Selects the part of the line to use as key for grouping"
    return x.split()[0]   # The first word

groups = []
for k, g in groupby(read_data("sample.txt"), grouping_key):
    groups.append(list(g))

print(groups)
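
Note that `groupby` only merges consecutive items with the same key, which is why the file needs to be sorted. If the attribute lines may be interleaved, a dictionary keyed on the first word is one alternative. The following is a minimal sketch, not a definitive implementation: the helper name `group_by_first_word` and the inline sample lines are made up for illustration, and it assumes the attribute name is always the first whitespace-separated word of a line. It also shows how the resulting sublists could be fed to `itertools.product` to build the combinations the question asks about.

```python
from collections import defaultdict
from itertools import product


def group_by_first_word(lines):
    """Group non-empty lines by their first word (hypothetical helper).

    Groups are returned in the order their key first appears, because
    dictionaries preserve insertion order (Python 3.7+).
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        groups[line.split()[0]].append(line)
    return list(groups.values())


# Illustrative input: same data as in the question, deliberately interleaved
sample = [
    "attr1 apple 1",
    "attr2 grapes 1",
    "",
    "attr1 banana 2",
    "attr3 watermelon 0",
    "attr2 oranges 2",
]

groups = group_by_first_word(sample)
print(groups)

# product() takes one element from each sublist; it is lazy, so iterate
# over it instead of building a list when the input is large
for combo in product(*groups):
    print(combo)
```

With a thousand attribute lines the number of combinations grows multiplicatively with the size of each group, so it is probably safer to consume the `product` iterator one tuple at a time than to store every combination.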
