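<p>Sort the two columns together, row by row, before you count, so that each pair can only appear in one fixed order (<code>('Smith JR', 'Kim YY')</code> and <code>('Kim YY', 'Smith JR')</code> become the same row). Only then apply your counting method. The sorting below uses the approach from <a href="https://stackoverflow.com/questions/51182228/python-delete-duplicates-in-a-dataframe-based-on-two-columns-combinations?noredirect=1&lq=1">this question</a>:</p>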
<pre><code>import numpy as np
import pandas as pd
import networkx as nx
mylist = [[('Smith JR','Kim YY'),('Smith JR','Ron AA'),('Kim YY','Ron AA')],[('Kim YY','Smith JR')],[('Smith JR','Ron AA')]]
flat_list = [item for sublist in mylist for item in sublist]
df = pd.DataFrame(flat_list, columns=["From", "To"])
# Sort the two names within each row so that every pair appears in one fixed order.
# (You can also sort the tuples earlier, before building the DataFrame.)
df = pd.DataFrame(np.sort(df[["From", "To"]].to_numpy(), axis=1), columns=["From", "To"])
# Now just apply the groupby and count the rows in each (From, To) group.
df_graph = df.groupby(["From", "To"]).size().reset_index()
# Output of print(df_graph):
#      From        To  0
# 0  Kim YY    Ron AA  1
# 1  Kim YY  Smith JR  2
# 2  Ron AA  Smith JR  2
</code></pre>
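<p>The code comment above notes that you can also sort earlier. A minimal sketch of that variant, assuming the same <code>mylist</code> input: each tuple is normalised with <code>sorted</code> before anything is counted, and <code>collections.Counter</code> does the counting. The names <code>pair_counts</code> and <code>rows</code> are made up for this illustration and are not part of the original answer.</p>

```python
from collections import Counter

mylist = [
    [("Smith JR", "Kim YY"), ("Smith JR", "Ron AA"), ("Kim YY", "Ron AA")],
    [("Kim YY", "Smith JR")],
    [("Smith JR", "Ron AA")],
]

# Normalise every pair so that (a, b) and (b, a) map to the same tuple,
# then count how often each normalised pair occurs.
pair_counts = Counter(
    tuple(sorted(pair)) for sublist in mylist for pair in sublist
)

# Flatten to (From, To, count) rows, ordered by pair (hypothetical layout).
rows = [(a, b, n) for (a, b), n in sorted(pair_counts.items())]

for row in rows:
    print(row)
```

<p>If you still want a DataFrame afterwards, <code>rows</code> can be passed to <code>pd.DataFrame(rows, columns=["From", "To", "weight"])</code>, where <code>"weight"</code> is just an illustrative column name.</p>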