Return the total character count of each row in a new column

Posted 2024-04-28 22:54:33


Suppose I have this DataFrame df:

column1      column2                                            column3
amsterdam    school yeah right backtic escapes sport swimming   2016
rotterdam    nope yeah                                          2012
thehague     i now i can fly no you cannot swimming rope        2010
amsterdam    sport cycling in the winter makes me               2019

How can I count all the characters (excluding spaces) in each row of column2 and put the result in a new column4, like this:

column1      column2                                            column3    column4
amsterdam    school yeah right backtic escapes sport swimming   2016       42
rotterdam    nope yeah                                          2012       8
thehague     i now i can fly no you cannot swimming rope        2010       34
amsterdam    sport cycling in the winter makes me               2019       30

I tried the code below, but so far every row gets the same number: the total character count across all rows of column2.

df['column4'] = sum(list(map(lambda x : sum(len(y) for y in x.split()), df['column2'])))

So my df currently looks like this:

column1      column2                                            column3    column4
amsterdam    school yeah right backtic escapes sport swimming   2016       114
rotterdam    nope yeah                                          2012       114
thehague     i now i can fly no you cannot swimming rope        2010       114
amsterdam    sport cycling in the winter makes me               2019       114

Does anyone know how to do this?


3 answers

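You can use the str.count method with a regular expression pattern. Note that \w matches only letters, digits and underscores, so punctuation is not counted: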

df['column2'].str.count(pat=r'\w')

Output:

0    42
1     8
2    34
3    30
Name: column2, dtype: int64
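
The sample data contains only letters, so \w gives the expected counts. If the text may contain punctuation, counting \S (any non-whitespace character) is probably closer to "all characters except spaces". A minimal sketch of the difference, using a made-up Series that is not part of the question's data:

```python
import pandas as pd

# Hypothetical sample with an apostrophe; not taken from the question
s = pd.Series(["can't stop", "nope yeah"])

word_chars = s.str.count(r"\w")  # letters, digits and underscores only
non_space = s.str.count(r"\S")   # every character that is not whitespace

print(pd.DataFrame({"text": s, "word_chars": word_chars, "non_space": non_space}))
```

The two counts differ only on the first row, where the apostrophe is skipped by \w but counted by \S.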

Hi, this works for me:

import pandas as pd
df=pd.DataFrame({'col1':['Stack Overflow','The Guy']})
df['Count Of Chars']=df['col1'].str.replace(" ","").apply(len)
df

Output:

             col1  Count Of Chars
0  Stack Overflow              13
1         The Guy               6
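
A possible variant of the same idea, in case the text contains tabs or other whitespace besides plain spaces: strip every whitespace run with a regular expression and use the vectorized str.len instead of apply(len). This is only a sketch under that assumption, reusing the column names from the answer above:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['Stack Overflow', 'The Guy']})

# regex=True treats the pattern as a regular expression;
# \s+ matches any run of whitespace (spaces, tabs, newlines)
stripped = df['col1'].str.replace(r'\s+', '', regex=True)

# str.len() returns the length of each string, element by element
df['Count Of Chars'] = stripped.str.len()
print(df)
```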

Use a custom lambda function with apply, so the sum is computed separately for each row:

df['column4'] = df['column2'].apply(lambda x: sum(len(y) for y in x.split()))

Or take the length of each value and subtract its number of spaces:

df['column4'] = df['column2'].str.len().sub(df['column2'].str.count(' '))
# the same logic rewritten as a custom function
#df['column4'] = df['column2'].map(lambda x: len(x) - x.count(' '))
print (df)
     column1                                           column2  column3  \
0  amsterdam  school yeah right backtic escapes sport swimming     2016   
1  rotterdam                                         nope yeah     2012   
2   thehague       i now i can fly no you cannot swimming rope     2010   
3  amsterdam              sport cycling in the winter makes me     2019   

   column4  
0       42  
1        8  
2       34  
3       30  
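
To see why the code in the question puts the same number in every row: its outer sum(...) adds up the per-row counts into a single scalar, and assigning a scalar to a column broadcasts it to all rows. A small sketch that rebuilds column2 from the question; the helper count_non_space and the column name broadcast are hypothetical names used only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'column2': [
    'school yeah right backtic escapes sport swimming',
    'nope yeah',
    'i now i can fly no you cannot swimming rope',
    'sport cycling in the winter makes me',
]})

def count_non_space(text):
    # split() drops all whitespace, so only the word lengths are summed
    return sum(len(word) for word in text.split())

# One count per row: this is what the new column should hold
per_row = [count_non_space(text) for text in df['column2']]

# The outer sum(...) in the question collapses those counts into one number
total = sum(per_row)

df['broadcast'] = total   # a scalar is repeated in every row
df['column4'] = per_row   # a list assigns one value per row
print(df[['broadcast', 'column4']])
```

Dropping the outer sum, as the apply-based answer does, is what keeps one value per row.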
