Concatenating two DataFrames over a large number of columns

Posted 2024-04-29 00:45:15


I need to concatenate two DataFrames over a large number of columns. This is my code:

pd.concat([mdf1[['user','tag1','tag2','tag3','tag4']].groupby(['user']).agg(sum)

I have a large number of tag columns, so I want my code to take every column from, say, 'tag1' onward. How can I do that?

mdf1:

        user        page_name            category  tag1  tag2  tag3
0  random guy        BlackBuck   Transport/Freight     1     1     0
1   mank nion        DJ CHETAS  Arts/Entertainment     0     1     1
2  random guy      GiveMeSport               Sport     1     0     1
3   mank nion  Gurkeerat Singh      Actor/Director     1     0     1

mdf2:

          user         page_name            category  tag1  tag2  tag3
0   pop rajuel      WOW Editions        Concert Tour   NaN   NaN   NaN
1  Roshan ghai            MensXP  News/Media Website   NaN   NaN   NaN
2    mank nion     Celina Jaitly             Actress   NaN   NaN   NaN
3   pop rajuel      500 Startups            App Page   1.0   0.0   1.0
4  Roshan ghai          No Abuse           Community   NaN   NaN   NaN
5   random guy  Analytics Ninja    Insurance Company   NaN   NaN   NaN
6   pop rajuel  Biswapati Sarkar      Actor/Director   1.0   0.0   0.0
7  Roshan ghai     the smartian        Public Figure   0.0   1.0   1.0

Output:

          user  tag1  tag2  tag3
0    mank nion   1.0   1.0   2.0
1   random guy   2.0   1.0   1.0
2  Roshan ghai   0.0   1.0   1.0
3    mank nion   NaN   NaN   NaN
4   pop rajuel   2.0   0.0   1.0
5   random guy   NaN   NaN   NaN

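The only difference in my real data is that I have many more columns, e.g. 'tag4' and 'tag5'. So I want my code to pick up all columns from 'tag1' onward. In the code above I am basically concatenating the two mdf DataFrames after grouping each by user and summing.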


1 answer
User
#1 · Posted 2024-04-29 00:45:15

I think you need concat, followed by groupby, aggregating with sum:

df = pd.concat([mdf1, mdf2])
print(df)
          user         page_name            category  tag1  tag2  tag3
0   random guy         BlackBuck   Transport/Freight   1.0   1.0   0.0
1    mank nion         DJ CHETAS  Arts/Entertainment   0.0   1.0   1.0
2   random guy       GiveMeSport               Sport   1.0   0.0   1.0
3    mank nion   Gurkeerat Singh      Actor/Director   1.0   0.0   1.0
0   pop rajuel      WOW Editions        Concert Tour   NaN   NaN   NaN
1  Roshan ghai            MensXP  News/Media Website   NaN   NaN   NaN
2    mank nion     Celina Jaitly             Actress   NaN   NaN   NaN
3   pop rajuel      500 Startups            App Page   1.0   0.0   1.0
4  Roshan ghai          No Abuse           Community   NaN   NaN   NaN
5   random guy   Analytics Ninja   Insurance Company   NaN   NaN   NaN
6   pop rajuel  Biswapati Sarkar      Actor/Director   1.0   0.0   0.0
7  Roshan ghai      the smartian       Public Figure   0.0   1.0   1.0

print(df.groupby('user', as_index=False).sum(numeric_only=True))
          user  tag1  tag2  tag3
0  Roshan ghai   0.0   1.0   1.0
1    mank nion   1.0   1.0   2.0
2   pop rajuel   2.0   0.0   1.0
3   random guy   2.0   1.0   1.0

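page_name and category are omitted because they are not numeric. Older pandas versions dropped such columns silently (automatic exclusion of nuisance columns); from pandas 2.0 on, sum() needs numeric_only=True to give the same result, otherwise the strings are concatenated per group.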
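
To pick up every column from 'tag1' onward without listing the names, a label-based column slice is one option. The following is a minimal sketch, not the asker's actual code: the helper name sum_tags_by_user and its first_tag parameter are made up for illustration, the two frames are shortened stand-ins for mdf1 and mdf2, and it assumes all tag columns sit contiguously at the right-hand end of the frame.

```python
import numpy as np
import pandas as pd

# Shortened stand-ins for the question's frames (assumption: same column layout).
mdf1 = pd.DataFrame({
    "user": ["random guy", "mank nion", "random guy", "mank nion"],
    "page_name": ["BlackBuck", "DJ CHETAS", "GiveMeSport", "Gurkeerat Singh"],
    "category": ["Transport/Freight", "Arts/Entertainment", "Sport", "Actor/Director"],
    "tag1": [1, 0, 1, 1],
    "tag2": [1, 1, 0, 0],
    "tag3": [0, 1, 1, 1],
})
mdf2 = pd.DataFrame({
    "user": ["pop rajuel", "Roshan ghai", "mank nion", "pop rajuel"],
    "page_name": ["WOW Editions", "the smartian", "Celina Jaitly", "500 Startups"],
    "category": ["Concert Tour", "Public Figure", "Actress", "App Page"],
    "tag1": [np.nan, 0.0, np.nan, 1.0],
    "tag2": [np.nan, 1.0, np.nan, 0.0],
    "tag3": [np.nan, 1.0, np.nan, 1.0],
})


def sum_tags_by_user(frame, first_tag="tag1"):
    """Hypothetical helper: per-user sum of every column from first_tag onward."""
    # Label-based slice: selects first_tag and every column to its right,
    # so extra columns such as tag4 or tag5 would be included automatically.
    tag_cols = frame.loc[:, first_tag:].columns.tolist()
    return frame.groupby("user", as_index=False)[tag_cols].sum()


combined = sum_tags_by_user(pd.concat([mdf1, mdf2]))
print(combined)
```

If the layout from the question is preferred (each frame grouped separately, then stacked), the same helper can be applied per frame: pd.concat([sum_tags_by_user(mdf1), sum_tags_by_user(mdf2)], ignore_index=True). Note that with default settings a group containing only NaN sums to 0; passing min_count=1 to sum() keeps it as NaN.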
