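Python: count the number of matching words between two columns
I want to count how many times the words from a list appear in a column. Here is my data frame: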
original                           people       result
John is a good friend              John, Mary   1
Mary and Peter are going to marry  Peter, Mary  2
Bond just met the Bond girl        Bond         2
Chris is having dinner             NaN          0
All Marys are here                 Mary         0
I tried the code suggested in "Check if a column in a data frame contains words from another column":
import pandas as pd
import re
df['result'] = [', '.join([p for p in po
                           if re.search(f'\\b{p}\\b', o)])
                for o, po in zip(df.original, df.people.str.split(',\o*'))
                ]
# And after I would try to calculate the number of words in column 'result'
But I get the following message:
error: bad escape \o at position 1
Can anyone offer some advice?
2 Answers
3
Use split on both columns, then check whether each word in "original" appears in "people":
df["people"] = df["people"].fillna("")
df["result"] = [sum(w in ws for w in s.split()) for s, ws in zip(df["original"], df["people"].str.split(', '))]
>>> df
                            original       people  result
0              John is a good friend   John, Mary       1
1  Mary and Peter are going to marry  Peter, Mary       2
2        Bond just met the Bond girl         Bond       2
3             Chris is having dinner                    0
4                 All Marys are here         Mary       0
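As for the error in the question: `bad escape \o` comes from the split pattern `',\o*'`, because `\o` is not a valid regex escape; `\s` (whitespace) was presumably intended. Below is a minimal sketch of the regex approach with that fixed, written with plain lists so it runs on its own. The helper name `count_names` is hypothetical, and treating a missing `people` value as "no names" is an assumption.

```python
import re

originals = [
    "John is a good friend",
    "Mary and Peter are going to marry",
    "Bond just met the Bond girl",
    "Chris is having dinner",
    "All Marys are here",
]
# float("nan") stands in for the missing value (NaN) in the "people" column.
people = ["John, Mary", "Peter, Mary", "Bond", float("nan"), "Mary"]


def count_names(text, names):
    """Count whole-word occurrences in `text` of each comma-separated name."""
    if not isinstance(names, str):  # NaN / None: nothing to look for
        return 0
    # r",\s*" splits on a comma followed by optional whitespace.
    parts = [n for n in re.split(r",\s*", names.strip()) if n]
    # re.escape guards against names containing regex metacharacters;
    # \b restricts matches to whole words, so "Mary" does not match "Marys".
    return sum(len(re.findall(rf"\b{re.escape(n)}\b", text)) for n in parts)


result = [count_names(o, p) for o, p in zip(originals, people)]
print(result)
```

With a data frame, the same helper could be applied as `df["result"] = [count_names(o, p) for o, p in zip(df["original"], df["people"])]`.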
2
In [39]: df = pd.DataFrame({
    ...:     'original': ["John is a good friend",
    ...:                  "Mary and Peter are going to marry",
    ...:                  "Bond just met the Bond girl",
    ...:                  "Chris is having dinner",
    ...:                  "All Marys are here"],
    ...:     'people': ["John, Mary", "Peter, Mary", "Bond", '', "Mary"],
    ...: })
In [40]: df
Out[40]:
                            original       people
0              John is a good friend   John, Mary
1  Mary and Peter are going to marry  Peter, Mary
2        Bond just met the Bond girl         Bond
3             Chris is having dinner
4                 All Marys are here         Mary
In [41]: df['result'] = df.apply(lambda row: sum((row['original'].count(p.strip()) for p in row['people'].split(',') if p), start=0), axis=1)
In [42]: df
Out[42]:
                            original       people  result
0              John is a good friend   John, Mary       1
1  Mary and Peter are going to marry  Peter, Mary       2
2        Bond just met the Bond girl         Bond       2
3             Chris is having dinner                    0
4                 All Marys are here         Mary       1
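Note that `str.count` counts substrings, which is why the last row gives 1 here ("Mary" occurs inside "Marys") while the expected result in the question is 0. A small sketch of the difference, using only the standard library; the variable names are just for illustration:

```python
import re

text = "All Marys are here"
name = "Mary"

# Substring counting: "Mary" is found inside "Marys".
substring_hits = text.count(name)

# Whole-word counting: \b requires a word boundary on both sides of the name.
whole_word_hits = len(re.findall(rf"\b{re.escape(name)}\b", text))

print(substring_hits, whole_word_hits)
```

Which behaviour is wanted depends on whether plurals and other inflected forms should count as matches.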