Convert a specific part of a string to uppercase?

Posted 2024-05-12 18:28:26


I have a DataFrame, and I want only specific parts of a string to become uppercase, wrapped in underscores.

|         TYPE       |  NAME  |
|-----------------------------|
| Contract Employee  | John   |
| Full Time Employee | Carol  |
| Temporary Employee | Kyle   |

I want to change the words "Contract" and "Temporary" to uppercase, with an underscore before and after:

|         TYPE         |  NAME  |
|-------------------------------|
| _CONTRACT_ Employee  | John   |
| Full Time Employee   | Carol  |
| _TEMPORARY_ Employee | Kyle   |

I tried str.upper(), but that makes the whole cell uppercase, and I only want those specific words changed.

Edit: I should mention, in case it matters, that these words are sometimes not capitalized. The value often appears as temporary employee rather than Temporary Employee.


3 Answers

A simple way to do this is replace with a dictionary of substitutions.

See the pandas documentation for Series.replace.

df["TYPE"] = df["TYPE"].replace({'Contract': '_CONTRACT_', 'Temporary': '_TEMPORARY_'}, regex=True)

Reproduced just now:

>>> df
                 TYPE   Name
0   Contract Employee   John
1  Full Time Employee  Carol
2  Temporary Employee   Kyle

>>> df["TYPE"] = df["TYPE"].replace({'Contract': '_CONTRACT_', 'Temporary': '_TEMPORARY_'}, regex=True)
>>> df
                   TYPE   Name
0   _CONTRACT_ Employee   John
1    Full Time Employee  Carol
2  _TEMPORARY_ Employee   Kyle
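
The dictionary above matches only the capitalized spellings. A minimal sketch of a case-insensitive variant follows, assuming the inline `(?i)` flag and `\b` word boundaries are acceptable in the dictionary keys. The sample DataFrame and the name `replacements` are illustrative and not part of the original answer.

```python
import pandas as pd

# Illustrative data: one row uses the lowercase spelling from the question's edit.
df = pd.DataFrame({
    "TYPE": ["Contract Employee", "Full Time Employee", "temporary employee"],
    "NAME": ["John", "Carol", "Kyle"],
})

# (?i) makes each pattern case-insensitive; \b restricts matches to whole words.
replacements = {
    r"(?i)\bcontract\b": "_CONTRACT_",
    r"(?i)\btemporary\b": "_TEMPORARY_",
}
df["TYPE"] = df["TYPE"].replace(replacements, regex=True)
result = df["TYPE"].tolist()
```

Only the matched word is replaced, so the rest of the cell keeps its original casing.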

Something that modifies the DataFrame (no regex or anything else):

words = ['Contract', 'Temporary']
# Split each cell on whitespace, wrap and uppercase the listed words, then rejoin
df['TYPE'] = df['TYPE'].apply(lambda x: ' '.join(['_' + w.upper() + '_' if w in words else w for w in x.split()]))

This uses join and split inside apply.

And now:

print(df)

gives:

                   TYPE   NAME
0   _CONTRACT_ Employee   John
1    Full Time Employee  Carol
2  _TEMPORARY_ Employee   Kyle
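
The split/join approach compares words exactly, so a lowercase "temporary" would be left alone. A minimal sketch of a case-insensitive variant, assuming a helper function (the name `mark_words` is hypothetical) that compares lowercased tokens:

```python
def mark_words(text, words):
    """Uppercase each listed word and wrap it in underscores, ignoring case."""
    targets = {w.lower() for w in words}
    return " ".join(
        "_" + token.upper() + "_" if token.lower() in targets else token
        for token in text.split()
    )

words = ["Contract", "Temporary"]
examples = ["Contract Employee", "temporary employee", "Full Time Employee"]
marked = [mark_words(t, words) for t in examples]

# With a DataFrame, the same helper can be passed to apply:
# df["TYPE"] = df["TYPE"].apply(lambda x: mark_words(x, words))
```

Because the text is split on whitespace and rejoined with single spaces, runs of spaces inside a cell are collapsed, and a word with attached punctuation (for example "Contract,") would not match.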

Here is an option using re.sub:

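import re

def type_to_upper(match):
    # Uppercase the matched word and wrap it in underscores
    return '_{}_'.format(match.group(1).upper())

text = "Contract Employee"
output = re.sub(r'\b(Contract|Temporary)\b', type_to_upper, text)
# output == '_CONTRACT_ Employee'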

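Edit:

Here is the same approach applied in pandas. It also addresses the latest edit to the question, where the words to replace may be capitalized or lowercase:

Test DataFrame: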

                 TYPE   NAME
0   Contract Employee   John
1  Full Time Employee  Carol
2  Temporary Employee   Kyle
3   contract employee   John
4  Full Time employee  Carol
5  temporary employee   Kyle

Solution:

def type_to_upper(match):
    return '_{}_'.format(match.group(1).upper())

# regex=True is required on pandas 2.0+, where str.replace defaults to regex=False
df.TYPE = df.TYPE.str.replace(r'\b([Cc]ontract|[Tt]emporary)\b', type_to_upper, regex=True)

Result:

df 
                   TYPE   NAME
0   _CONTRACT_ Employee   John
1    Full Time Employee  Carol
2  _TEMPORARY_ Employee   Kyle
3   _CONTRACT_ employee   John
4    Full Time employee  Carol
5  _TEMPORARY_ employee   Kyle

Note that this only handles the two cases defined in the OP's request. For fully case-insensitive matching, it is even simpler:

df.TYPE = df.TYPE.str.replace(r'\b(contract|temporary)\b', type_to_upper, case=False, regex=True)
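
For a longer list of words, the pattern can be built instead of typed by hand. The following is a self-contained sketch under that assumption. The word list, the sample DataFrame, and the use of re.escape are illustrative additions, not part of the original answer.

```python
import re
import pandas as pd

# Illustrative data mixing capitalized and lowercase spellings.
df = pd.DataFrame({
    "TYPE": ["Contract Employee", "Full Time Employee", "temporary employee"],
    "NAME": ["John", "Carol", "Kyle"],
})

def type_to_upper(match):
    # Uppercase the captured word and wrap it in underscores
    return "_{}_".format(match.group(1).upper())

# Build \b(contract|temporary)\b from a list; re.escape guards against
# regex metacharacters in the words.
words = ["contract", "temporary"]
pattern = r"\b(" + "|".join(re.escape(w) for w in words) + r")\b"

df["TYPE"] = df["TYPE"].str.replace(pattern, type_to_upper, case=False, regex=True)
result = df["TYPE"].tolist()
```

Passing regex=True explicitly keeps the call working across pandas versions, since a callable replacement is only accepted for regex patterns.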
