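<p>If I understand correctly, what you want is to identify the connected components of a graph in which each node is an atom and each edge is a bond (so each connected component is a molecule). There is an efficient implementation of this in <a href="https://docs.scipy.org/doc/scipy/reference/sparse.csgraph.html" rel="nofollow noreferrer"><code>scipy.sparse.csgraph</code></a>.</p>
<p>First, let's build the graph as a sparse matrix:</p>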
<pre><code>import numpy as np
import scipy.sparse as sps
# Input as provided
edges = [[1,2],[1,3],[1,4],[1,5],[5,6],[5,7],[5,8],[9,10],[9,11],[9,12],[9,13]]
# Modify the input by adding, for each [x,y], also [y,x].
# Going through a set and back to a list ensures
# that no edge is duplicated.
edges = list({(x[0],x[1]) for x in edges}.union({(x[1],x[0]) for x in edges}))
# Build the adjacency matrix. The weights of all edges are set to 1,
# as they don't matter anyway.
graph = sps.csr_matrix(([1]*len(edges), np.array(edges).T))
</code></pre>
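<p>At this point it is enough to call <a href="https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csgraph.connected_components.html#scipy.sparse.csgraph.connected_components" rel="nofollow noreferrer"><code>connected_components</code></a>, although by default the output comes in a slightly different format, namely the number of components followed by an array giving the component label of each node:</p>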
<blockquote>
<p>(3, array([0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2]))</p>
</blockquote>
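<p>As a side note, here is a minimal sketch of reading that tuple. It also passes <code>directed=False</code>, which tells <code>connected_components</code> to treat each stored edge as usable in both directions, so in this sketch the edge list is not symmetrized by hand. The variable names (<code>one_way</code>, <code>n_nodes</code>) are only illustrative:</p>

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse import csgraph

# Same bonds as above, each stored in one direction only.
edges = np.array([[1, 2], [1, 3], [1, 4], [1, 5], [5, 6], [5, 7], [5, 8],
                  [9, 10], [9, 11], [9, 12], [9, 13]])
rows, cols = edges.T

# Node indices run from 0 to the largest atom number, so index 0 is unused here.
n_nodes = int(edges.max()) + 1
one_way = sps.csr_matrix((np.ones(len(edges), dtype=int), (rows, cols)),
                         shape=(n_nodes, n_nodes))

# directed=False: an edge i -> j also connects j to i.
n_components, labels = csgraph.connected_components(one_way, directed=False)

# labels[i] is the component label of node i; nodes sharing a label
# belong to the same molecule.
print(n_components)
print(labels)
```

<p>The first element counts the components (including the unused node 0), and the second is indexed by node number.</p>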
<p>Let's reshape that result so that each molecule is given as a list of its atoms:</p>
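<pre><code>from scipy.sparse import csgraph
# Returns (number of components, array of per-node labels)
connected_components = csgraph.connected_components(graph)
result = []
# Start at 1 to skip component 0, which is the unused node 0
for u in range(1, connected_components[0]):
    result.append(np.where(connected_components[1]==u)[0])
result
</code></pre>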
<blockquote>
<p>[array([1, 2, 3, 4, 5, 6, 7, 8], dtype=int64),</p>
<p>array([ 9, 10, 11, 12, 13], dtype=int64)]</p>
</blockquote>
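<p>Note also that the <code>range</code> starts at 1. Python counts from 0, so the matrix contains a node 0, and because your numbering starts at 1 that node has no edges and shows up as an isolated component of its own. If the atom numbering is non-contiguous, every unused number likewise becomes an isolated node, and you need to skip those as well, for example:</p>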
<pre><code>result = [r for r in result if len(r) > 1]
</code></pre>
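<p>If the atom numbers are sparse, an alternative to filtering afterwards is to remap them to 0..n-1 before building the matrix, so that unused numbers never become nodes at all. The following is only a sketch under that assumption: <code>group_atoms</code> is a hypothetical helper name (not part of SciPy), and it assumes every atom of interest appears in at least one bond.</p>

```python
import numpy as np
import scipy.sparse as sps
from scipy.sparse import csgraph


def group_atoms(bonds):
    """Group atom ids into molecules (hypothetical helper, for illustration)."""
    bonds = np.asarray(bonds)
    # Map arbitrary atom ids to contiguous indices 0..n-1.
    # reshape() keeps this working whether np.unique returns a flat
    # or an input-shaped inverse array.
    atom_ids, inverse = np.unique(bonds, return_inverse=True)
    rows, cols = inverse.reshape(bonds.shape).T
    n = len(atom_ids)
    graph = sps.csr_matrix((np.ones(len(rows), dtype=int), (rows, cols)),
                           shape=(n, n))
    # directed=False treats every bond as connecting both atoms.
    n_molecules, labels = csgraph.connected_components(graph, directed=False)
    # Translate the contiguous indices back to the original atom ids.
    return [atom_ids[labels == k].tolist() for k in range(n_molecules)]


bonds = [[1, 2], [1, 3], [1, 4], [1, 5], [5, 6], [5, 7], [5, 8],
         [9, 10], [9, 11], [9, 12], [9, 13]]
molecules = group_atoms(bonds)
sparse_ids = group_atoms([[10, 20], [30, 40], [20, 50]])
```

<p>With this remapping there is no node 0 to skip and no length filter is needed, at the cost of one extra <code>np.unique</code> pass over the bond list.</p>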