Python: Selecting only the sentences in a text that contain a specific word


Posted 2024-06-06 19:29:33



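My dataset has a column that holds a large amount of text. Each row in that column contains several sentences.

I want to find the (sub)sentences that contain the word "dentist" and delete all the other sentences, then save the resulting text.

For example, when a row contains the following text: "My dentist is great. But the assistant is terrible. I just love the dentist."

The result should be: "My dentist is great. I just love the dentist."

This is my script so far, where df is my dataset: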

sentence= df['columnwithtext']
for subsentence in sentence.split("."):
    if "dentist" in subsentence:
        print(subsentence)

However, when I run this script I get nothing, not even an error... What is missing?

Then I tried this script:

df_dentist=df[df['columnwithtext'].str.contains("dentist")]
df_dentist

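But this returns the entire text of every row that contains the word "dentist", including the sentences I don't need.

What am I doing wrong? Thanks in advance.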


1 Answer

Answer 1 · posted 2024-06-06 19:29:33

Maybe this is what you are looking for (findall with join):

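import pandas as pd

df = pd.DataFrame(["My dentist is great. However the assistent is horrible. I just love the dentist.",
                   "No dentist is good. Every dentist is bad. This is not correct",
                   "Dentist or not. dentist is a dentist."], columns=['dental'])

# Capture every sentence containing "dentist" (up to and including its full stop),
# then join the matches of each row back into a single string
df.dental.str.findall(r'([^\.]+dentist[^\.]*\.)').apply(''.join)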

This gives the following output:

0    My dentist is great. I just love the dentist.
1        No dentist is good. Every dentist is bad.
2                            dentist is a dentist.
Name: dental, dtype: object
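
Note that row 2 loses "Dentist or not." because the pattern is case-sensitive and also needs at least one character before "dentist". If that is not what you want, a plain-Python alternative is sketched below. It is only a sketch under some assumptions: the helper name `keep_sentences_with` and the output column `dentist_only` are made up for illustration, sentences are assumed to end with a full stop, and the match is a simple substring test (so "dentistry" would also count).

```python
import pandas as pd


def keep_sentences_with(text: str, word: str = "dentist") -> str:
    """Return only the sentences of `text` that mention `word` (case-insensitive)."""
    # Naive split on full stops; abbreviations such as "Dr." would also be split here
    sentences = [s.strip() for s in text.split(".")]
    kept = [s for s in sentences if s and word.lower() in s.lower()]
    # Put back the full stop that split() removed
    return " ".join(s + "." for s in kept)


df = pd.DataFrame(
    {"columnwithtext": [
        "My dentist is great. However the assistent is horrible. I just love the dentist.",
        "Dentist or not. dentist is a dentist.",
    ]}
)

# Series.apply calls the helper once per row; the result is stored in a new column
df["dentist_only"] = df["columnwithtext"].apply(keep_sentences_with)
print(df["dentist_only"].tolist())
```

Calling the helper through `.apply` processes each row's string separately, which is why the string method `split` is available inside the helper, whereas the column as a whole is a pandas Series rather than a single string.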
