Merging rows to remove duplicates in a CSV with Python and Pandas

Posted 2024-06-16 14:56:23


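I'm trying to use Python and pandas to combine groups of rows in order to remove duplicates from a CSV. Wherever rows are duplicated on a common value, "ID", the values in another column, "HostAffected", should be combined with line breaks. This is similar to another post (link not preserved), but I need to keep all existing values that belong to the same ID. I've already done something similar for columns, using the code below as an example, but it isn't quite the same: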

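import pandas as pd

df = pd.read_csv("output.csv")

# Join Host, Protocol and Port into one "host/protocol/port" string per row
cols = ['Host', 'Protocol', 'Port']
newcol = ['/'.join(i) for i in zip(df['Host'], df['Protocol'], df['Port'].map(str))]
# Add the combined column and drop the three source columns
df = df.assign(HostAffected=newcol).drop(columns=cols)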
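As a side note, the same column-combining step can also be sketched with `Series.str.cat`, which joins the columns element-wise. The sample rows here are made up for illustration; only the column names come from the snippet above.

```python
import pandas as pd

# Hypothetical sample rows; the column names follow the snippet above.
df = pd.DataFrame({
    "Host": ["10.0.0.10", "192.168.0.12"],
    "Protocol": ["tcp", "tcp"],
    "Port": [445, 139],
})

cols = ["Host", "Protocol", "Port"]

# Port is numeric, so cast it to str before concatenating with "/" between parts.
host_affected = df["Host"].str.cat(
    [df["Protocol"], df["Port"].astype(str)], sep="/"
)

# Add the combined column and drop the three source columns.
df = df.assign(HostAffected=host_affected).drop(columns=cols)
print(df)
```

Either form should give one "host/protocol/port" string per row; `str.cat` just avoids the explicit `zip` loop.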

So far I have this code:

[code block not preserved in the source]

It is adapted from another thread (link not preserved), but it doesn't work.

An example of the data I have is as follows:

PluginID    Description HostAffected
10395   Windows SMB Shares Enumeration  10.0.0.10/tcp/445
10396   Windows SMB Shares Access   10.0.0.10/tcp/445
10396   Windows SMB Shares Access   192.168.0.12/tcp/445
10398   Windows SMB LsaQueryInformationPolicy   10.0.0.10/tcp/445
10399   SMB Use Domain SID to Enumerate Users   10.0.0.10/tcp/445
10400   Windows SMB Registry Remotely Accessible    10.0.0.10/tcp/445
10736   DCE Services Enumeration    10.0.0.10/tcp/139
10736   DCE Services Enumeration    10.0.0.10/tcp/445
10736   DCE Services Enumeration    192.168.0.12/tcp/445

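The values are actually comma-separated, but I've used spaces here to make them easier to read. I want the result to look like this, with only one unique row per "Plugin ID" and "Description", and the "HostAffected" values combined: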

ID      Description                               HostAffected
10395   Windows SMB Shares Enumeration            10.0.0.10/tcp/445
10396   Windows SMB Shares Access                 10.0.0.10/tcp/445
                                                  192.168.0.12/tcp/445
10398   Windows SMB LsaQueryInformationPolicy     10.0.0.10/tcp/445
10399   SMB Use Domain SID to Enumerate Users     10.0.0.10/tcp/445
10400   Windows SMB Registry Remotely Accessible  10.0.0.10/tcp/445
10736   DCE Services Enumeration                  10.0.0.10/tcp/139
                                                  10.0.0.10/tcp/445
                                                  192.168.0.12/tcp/445

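Essentially, several affected hosts may share the same ID and description. Any help would be greatly appreciated, since this is somewhat more complicated and challenging than combining columns.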


1 answer

User

#1 · Posted 2024-06-16 14:56:23

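Following the comments, the solution is to remove any trailing whitespace from the descriptions with str.strip, then use groupby with apply and join, with a line break as the separator: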

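# Strip trailing whitespace so otherwise identical descriptions group together
df['Description'] = df['Description'].str.strip()

# One row per (Plugin ID, Description); the hosts are joined with line breaks.
# The grouping key is 'Description', matching the column used in the question.
df = (df.groupby(['Plugin ID', 'Description'])['HostAffected']
        .apply('\n'.join)
        .reset_index())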
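To make the approach easier to try out, here is a minimal self-contained sketch that runs the same strip, groupby and join steps on a few of the sample rows from the question. The header name "Plugin ID" and the trailing space in one description are assumptions added for illustration, and `agg` is used in place of `apply`, which should behave the same way here. The last part writes the result to an in-memory CSV and reads it back, since fields with embedded line breaks are quoted on output.

```python
import io

import pandas as pd

# A few sample rows from the question, written as real comma-separated CSV.
# Assumptions: the header is "Plugin ID", and one description carries a
# trailing space to show why the strip step matters.
csv_text = """Plugin ID,Description,HostAffected
10395,Windows SMB Shares Enumeration,10.0.0.10/tcp/445
10396,Windows SMB Shares Access ,10.0.0.10/tcp/445
10396,Windows SMB Shares Access,192.168.0.12/tcp/445
10736,DCE Services Enumeration,10.0.0.10/tcp/139
10736,DCE Services Enumeration,10.0.0.10/tcp/445
10736,DCE Services Enumeration,192.168.0.12/tcp/445
"""

df = pd.read_csv(io.StringIO(csv_text))

# Without this, "Windows SMB Shares Access " and "Windows SMB Shares Access"
# would end up in two separate groups.
df["Description"] = df["Description"].str.strip()

# One row per (Plugin ID, Description); sort=False keeps first-seen order.
merged = (
    df.groupby(["Plugin ID", "Description"], sort=False)["HostAffected"]
      .agg("\n".join)
      .reset_index()
)

# Round trip through CSV: multi-line fields are quoted when written,
# so they can be read back as single cells.
buf = io.StringIO()
merged.to_csv(buf, index=False)
reloaded = pd.read_csv(io.StringIO(buf.getvalue()))

print(merged)
```

Note that a spreadsheet program will show the combined hosts on separate lines inside one cell, while a plain text editor will show the quoted field spanning several physical lines.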
