Create a new field in a DataFrame based on a string value in another field

Posted 2024-04-28 14:54:53


Below is my pandas DataFrame (shown as a screenshot in the original post). The result I want is in the column highlighted in yellow.


Here is the logic I am trying to implement:

  • If the "Award" column contains "Top IRA Advisor", I want the "Industry_Recognition_Flag" field to show "Recognized as Top IRA Advisor". Otherwise, I want it to be blank.

Here is the code I tried, which did not work:

df_rfholder['Industry_Recognition_Flag'] = np.where(df_rfholder['Award'].str.contains('(?:Top IRA Advisor)', regex = True), 'Recognized as Top IRA Advisor', '')

Any help is greatly appreciated!



2 answers

How about a simple approach like this?

>>> import pandas as pd
>>> data = {'Award': 8*['']+['2016 Top IRA Advisor', '', '2016 Top IRA Advisor']}
>>> df = pd.DataFrame(data)
>>> df
                   Award
0                       
1                       
2                       
3                       
4                       
5                       
6                       
7                       
8   2016 Top IRA Advisor
9                       
10  2016 Top IRA Advisor
>>> df['Desired Result']=df['Award'].apply(lambda x: 'Recognized as Top IRA Advisor' if x=='2016 Top IRA Advisor' else '')
>>> df
                   Award                 Desired Result
0                                                      
1                                                      
2                                                      
3                                                      
4                                                      
5                                                      
6                                                      
7                                                      
8   2016 Top IRA Advisor  Recognized as Top IRA Advisor
9                                                      
10  2016 Top IRA Advisor  Recognized as Top IRA Advisor
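
The lambda above tests for exact equality with '2016 Top IRA Advisor', while the question asks whether "Award" contains "Top IRA Advisor". A minimal sketch of the substring version follows. The sample rows, including the missing value and the 2017 entry, are invented for illustration, and passing na=False is an assumption about how missing awards should be treated (as "no match"), not something the question states.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: an empty string, a match, a missing value,
# and a match with extra text around the phrase
df = pd.DataFrame(
    {"Award": ["", "2016 Top IRA Advisor", None, "2017 Top IRA Advisor (regional)"]}
)

# Plain substring test (regex=False); na=False makes missing values count as False
mask = df["Award"].str.contains("Top IRA Advisor", regex=False, na=False)

# Label matching rows, leave the others blank
df["Industry_Recognition_Flag"] = np.where(mask, "Recognized as Top IRA Advisor", "")

print(df)
```

Without na=False, rows with a missing "Award" produce a missing value in the mask rather than False, which may be worth checking if the flag shows up on rows where it should not.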

You can use .str.match()... https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.match.html

Here is a working example:

import pandas as pd
import numpy as np

# Sample data: column 'one' holds the award text, 'two' is an unrelated column
d = {'one': pd.Series(['', '2016 Top IRA Advisor', '2016 Top IRA Advisor'], index=['a', 'b', 'c']),
     'two': pd.Series(['Recognized', 'Recognized', 'Recognized'], index=['a', 'b', 'c'])}

df = pd.DataFrame(d)

# str.match anchors at the start of the string, so the leading '.*' allows any prefix
df["new"] = np.where(df['one'].str.match('.*Top IRA Advisor'), 'true', 'false')

print(df)
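
The leading '.*' in that pattern matters because str.match only matches from the start of each string. The sketch below compares it with str.contains on a small made-up Series; the three sample strings are assumptions chosen to show the difference.

```python
import pandas as pd

# Hypothetical values: empty, phrase after a prefix, phrase at the start
s = pd.Series(["", "2016 Top IRA Advisor", "Top IRA Advisor 2016"])

# Anchored at the start: only the string that begins with the phrase matches
anchored = s.str.match("Top IRA Advisor")

# '.*' allows any prefix, so both non-empty strings match
prefixed = s.str.match(".*Top IRA Advisor")

# Substring search finds the phrase anywhere, no pattern needed
contains = s.str.contains("Top IRA Advisor", regex=False)

print(pd.DataFrame({"anchored": anchored, "prefixed": prefixed, "contains": contains}))
```

For a plain "does the column contain this phrase" check, str.contains with regex=False is probably the more direct choice.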
