Python: finding sets with shared elements

Posted 2024-03-29 12:33:59


For example, my data is a set of frozensets:

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])

The expected result is the set of frozensets that share at least one element with another frozenset, i.e.

result = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]),  frozenset([1,1000, 2000])])

Here frozenset([100,200]) is removed because it shares no element with any other frozenset. What is an efficient way to achieve this?


3 answers

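You can build a dict that maps each set element to the number of frozensets it was found in, then remove every frozenset whose elements all have a count of 1. collections.Counter is handy for this.

The advantage is that it runs in O(n), where n is the total number of elements across all sets.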

from collections import Counter

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
counts = Counter(elt for fs in data for elt in fs)
result = {fs for fs in data if any(counts[elt] > 1 for elt in fs)}

# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}
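
To make the counting step easier to follow, here is a minimal sketch that prints the intermediate result. It relies on `data` being a set of frozensets, so each element is counted at most once per frozenset and a count above 1 means it occurs in at least two different frozensets. The name `shared` and the `isdisjoint` filter are my own variation, not part of the answer above.

```python
from collections import Counter

data = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
        frozenset([100, 200]), frozenset([1, 1000, 2000])}

# Count in how many frozensets each element occurs.
counts = Counter(elt for fs in data for elt in fs)

# Elements that occur in more than one frozenset (hypothetical helper name).
shared = {elt for elt, n in counts.items() if n > 1}
print(sorted(shared))  # [1, 3, 4]

# Keep every frozenset that contains at least one shared element.
result = {fs for fs in data if not fs.isdisjoint(shared)}
print(frozenset([100, 200]) in result)  # False
```

Note that this assumes the input is a set. If `data` were a list containing the same frozenset twice, its elements would get a count of 2 and it would be kept even though it overlaps with nothing else.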

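I would do it with a set comprehension and a check like this (for each item, check whether it has at least one element in common with some other item):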

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])

new_data = {x for x in data if any(not x.isdisjoint(y) for y in data if y!=x)}

print(new_data)

Result:

{frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6, 7, 8}), frozenset({1000, 1, 2000})}

There may be more efficient solutions, but at least the disjointness test is handled by the efficient built-in set routines.
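
For what it's worth, this comprehension compares each frozenset against the others, so the number of `isdisjoint` calls can grow quadratically with the number of frozensets, whereas the `Counter` answer above is linear in the total number of elements. A quick sanity check that both give the same result on the sample data (the function names `by_pairwise` and `by_counting` are mine, chosen only for this sketch):

```python
from collections import Counter

def by_pairwise(data):
    # Keep x if it overlaps with some other frozenset y.
    return {x for x in data if any(not x.isdisjoint(y) for y in data if y != x)}

def by_counting(data):
    # Keep fs if one of its elements occurs in two or more frozensets.
    counts = Counter(elt for fs in data for elt in fs)
    return {fs for fs in data if any(counts[elt] > 1 for elt in fs)}

data = {frozenset([1, 2, 3, 4]), frozenset([3, 4, 5, 6, 7, 8]),
        frozenset([100, 200]), frozenset([1, 1000, 2000])}

print(by_pairwise(data) == by_counting(data))  # True
```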

Here is my version. It has no particular advantage, but you may find it more readable.

data = set([frozenset([1,2,3,4]), frozenset([3,4,5,6,7,8]), frozenset([100,200]), frozenset([1,1000, 2000])])
result = set()

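for item in data:
    for element in item:
        for other_item in data:
            # Ignore the item itself; do nothing more once item is already in result.
            if item != other_item and item not in result:
                if element in other_item:
                    # element also occurs in another frozenset, so keep item.
                    result.add(item)
                    break  # leaves only the innermost loop

print(result)
# {frozenset({1, 2, 3, 4}), frozenset({1000, 1, 2000}), frozenset({3, 4, 5, 6, 7, 8})}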
