How do I identify duplicate IDs and assign new IDs?

for index, row in df.iterrows():
    plasmidlist = [row[1]]
    if duplicate == True:  # Is there a duplicate function I can use?
        plasmidlist.append(duplicaterow[1])
        drop(duplicaterow)
    df.at[row, 'Plasmid'] = plasmidlist

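3 Answers

User

Answer 1 · edited 2024-04-29 04:52:02

If your parsing algorithm works correctly, I would use a dictionary structure for this task. In Python, you can easily check whether an item exists in a list: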

     for item in parent_list:
         if item in plasmid_list:
             pass  # do thing
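
The dictionary idea mentioned in this answer could look roughly like the sketch below. It is only an illustration under assumptions: the `records` list is hypothetical sample data (reusing the sequence and plasmid names that appear elsewhere in this thread), and `plasmids_by_oligo` and `duplicates` are made-up names, not part of the asker's code.

```python
from collections import defaultdict

# Hypothetical sample data: (oligo sequence, plasmid) pairs.
records = [
    ("ATG", "Plasmid A"),
    ("ATG", "Plasmid B"),
    ("CAG", "Plasmid C"),
]

# Map each oligo sequence to the list of plasmids that carry it.
plasmids_by_oligo = defaultdict(list)
for oligo, plasmid in records:
    plasmids_by_oligo[oligo].append(plasmid)

# A sequence that collected more than one plasmid is a duplicate.
duplicates = {
    oligo: names
    for oligo, names in plasmids_by_oligo.items()
    if len(names) > 1
}
print(duplicates)  # {'ATG': ['Plasmid A', 'Plasmid B']}
```

A dictionary lookup avoids rescanning a list for every row, which is why it tends to scale better than repeated `in` checks on a list.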

User

Answer 2 · edited 2024-04-29 04:52:02

You can use groupby together with .apply(list):

import pandas as pd

df = pd.DataFrame({'Oligo_sequence': ['ATG', 'ATG', 'CAG'], 'Plasmid': ['Plasmid A', 'Plasmid B', 'Plasmid C']})

print(df.groupby('Oligo_sequence')['Plasmid'].apply(list).reset_index())

Prints:

  Oligo_sequence                 Plasmid
0            ATG  [Plasmid A, Plasmid B]
1            CAG             [Plasmid C]

User

Answer 3 · edited 2024-04-29 04:52:02

Use groupby and agg with list:

df.groupby('Oligo_sequence')['Plasmid'].agg(list)

Output:

Oligo_sequence
ATG    [Plasmid A, Plasmid B]
CAG               [Plasmid C]
Name: Plasmid, dtype: object
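
As for the "duplicate function" the question asks about, pandas does provide `DataFrame.duplicated`. The sketch below is one possible way to combine it with the grouping above, assuming the same sample `df` used in this thread; the variable names `is_dup` and `grouped` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Oligo_sequence": ["ATG", "ATG", "CAG"],
    "Plasmid": ["Plasmid A", "Plasmid B", "Plasmid C"],
})

# keep=False flags every row whose Oligo_sequence occurs more than once,
# not just the second and later occurrences.
is_dup = df.duplicated(subset="Oligo_sequence", keep=False)

# Collapse to one row per sequence, gathering its plasmids into a list.
# reset_index() turns the grouped Series back into a two-column DataFrame.
grouped = df.groupby("Oligo_sequence")["Plasmid"].agg(list).reset_index()

print(is_dup.tolist())  # [True, True, False]
print(grouped)
```

Grouping replaces the row-by-row loop from the question, so there is no need to drop duplicate rows by hand.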
