Python:如何将列表合并为簇?
我有一个包含元组的列表:
[(3,4), (18,27), (4,14)]
我需要写一段代码,把那些有重复数字的元组合并成一个新的列表,这个新列表里的所有元素都应该只包含唯一的数字。而且,这个列表还要按照元组的长度进行排序,也就是说:
>>> MergeThat([(3,4), (18,27), (4,14)])
[(3,4,14), (18,27)]
>>> MergeThat([(1,3), (15,21), (1,10), (57,66), (76,85), (66,76)])
[(57,66,76,85), (1,3,10), (15,21)]
我知道这有点像层次聚类算法,我之前读过相关的内容,但还是搞不明白。
有没有比较简单的代码可以实现一个叫 MergeThat() 的函数呢?
6 个回答
1
在编程中,有时候我们需要将一些数据从一个地方传到另一个地方。这就像把水从一个杯子倒到另一个杯子一样。为了做到这一点,我们通常会用到一些工具或者方法。
在这个过程中,可能会遇到一些问题,比如数据格式不对,或者数据丢失。这就像你在倒水的时候,水可能会洒出来一样。因此,我们需要确保在传输数据时,能够正确处理这些问题。
有时候,程序会给我们一些错误提示,这些提示就像是一个信号,告诉我们哪里出错了。我们需要仔细阅读这些提示,找出问题所在,然后进行修正。
总之,数据传输是编程中一个很重要的部分,我们需要掌握一些基本的方法和技巧,才能顺利地完成这个过程。
def collapse(L):
""" The input L is a list that contains tuples of various sizes.
If any tuples have shared elements,
exactly one instance of the shared and unshared elements are merged into the first tuple with a shared element.
This function returns a new list that contain merged tuples and an int that represents how many merges were performed."""
answer = []
merges = 0
seen = [] # a list of all the numbers that we've seen so far
for t in L:
tAdded = False
for num in t:
pleaseMerge = True
if num in seen and pleaseMerge:
answer += merge(t, answer)
merges += 1
pleaseMerge = False
tAdded= True
else:
seen.append(num)
if not tAdded:
answer.append(t)
return (answer, merges)
def merge(t, L):
""" The input L is a list that contains tuples of various sizes.
The input t is a tuple that contains an element that is contained in another tuple in L.
Return a new list that is similar to L but contains the new elements in t added to the tuple with which t has a common element."""
answer = []
while L:
tup = L[0]
tupAdded = False
for i in tup:
if i in t:
try:
L.remove(tup)
newTup = set(tup)
for i in t:
newTup.add(i)
answer.append(tuple(newTup))
tupAdded = True
except ValueError:
pass
if not tupAdded:
L.remove(tup)
answer.append(tup)
return answer
def sortByLength(L):
""" L is a list of n-tuples, where n>0.
This function will return a list with the same contents as L
except that the tuples are sorted in non-ascending order by length"""
lengths = {}
for t in L:
if len(t) in lengths.keys():
lengths[len(t)].append(t)
else:
lengths[len(t)] = [(t)]
l = lengths.keys()[:]
l.sort(reverse=True)
answer = []
for i in l:
answer += lengths[i]
return answer
def MergeThat(L):
answer, merges = collapse(L)
while merges:
answer, merges = collapse(answer)
return sortByLength(answer)
if __name__ == "__main__":
print 'starting'
print MergeThat([(3,4), (18,27), (4,14)])
# [(3, 4, 14), (18, 27)]
print MergeThat([(1,3), (15,21), (1,10), (57,66), (76,85), (66,76)])
# [(57, 66, 76, 85), (1, 10, 3), (15, 21)]
4
import itertools
def merge_it(lot):
merged = [ set(x) for x in lot ] # operate on sets only
finished = False
while not finished:
finished = True
for a, b in itertools.combinations(merged, 2):
if a & b:
# we merged in this iteration, we may have to do one more
finished = False
if a in merged: merged.remove(a)
if b in merged: merged.remove(b)
merged.append(a.union(b))
break # don't inflate 'merged' with intermediate results
return merged
if __name__ == '__main__':
print merge_it( [(3,4), (18,27), (4,14)] )
# => [set([18, 27]), set([3, 4, 14])]
print merge_it( [(1,3), (15,21), (1,10), (57,66), (76,85), (66,76)] )
# => [set([21, 15]), set([1, 10, 3]), set([57, 66, 76, 85])]
print merge_it( [(1,2), (2,3), (3,4), (4,5), (5,9)] )
# => [set([1, 2, 3, 4, 5, 9])]
这里有一段代码示例(包括测试用例):http://gist.github.com/586252
9
我努力想搞明白这个问题,但在尝试了Ian的回答后(谢谢他!),我才意识到理论上的问题是什么:输入是一组边,这些边定义了一个图。我们要找的是这个图的强连通分量。就是这么简单。
虽然你可以高效地做到这一点,但其实没有必要自己去实现这个!只需要导入一个好的图形库就可以了:
import networkx as nx
# one of your examples
g1 = nx.Graph([(1,3), (15,21), (1,10), (57,66), (76,85), (66,76)])
print nx.connected_components(g1) # [[57, 66, 76, 85], [1, 10, 3], [21, 15]]
# my own test case
g2 = nx.Graph([(1,2),(2,10), (20,3), (3,4), (4,10)])
print nx.connected_components(g2) # [[1, 2, 3, 4, 10, 20]]