How do I replace specific words throughout a CSV file?

infoID  messages
111     we need to fix the car mag but we can't
113     we need a shf to perform eng change
115     gr is needed to change
116     bat needs change
117     car towed for ext change
118     car ml is high
.       .

2 Answers

User

Answer 1 · Edited 2024-04-25 10:27:27

Read the text file of abbreviations in as a Series:

s

0    mag:magnitude
1        shf:shaft
2          gr:gear
3      bat:battery
4      ext:exhaust
5       ml:mileage
Name: 0, dtype: object
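The answer does not show the read step itself. A minimal sketch of one way to do it, assuming the mapping file holds one `key:value` pair per line and contains no commas. The variable `mapping_text` and the use of `io.StringIO` are stand-ins so the example runs without a file on disk; a real path could be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical contents of the mapping file, one "abbreviation:expansion" per line.
mapping_text = (
    "mag:magnitude\n"
    "shf:shaft\n"
    "gr:gear\n"
    "bat:battery\n"
    "ext:exhaust\n"
    "ml:mileage\n"
)

# header=None treats the first line as data. Because the lines contain no
# commas, each line lands in a single column labelled 0, which we select
# to get a Series.
s = pd.read_csv(io.StringIO(mapping_text), header=None)[0]

print(s)
```

Selecting column `0` is what gives the Series the name `0` seen in the output above.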

Split on the colon and convert the Series into a dictionary that maps each key to its replacement:

dict(s.str.split(':').tolist())

# {'bat': 'battery',
#  'ext': 'exhaust',
#  'gr': 'gear',
#  'mag': 'magnitude',
#  'ml': 'mileage',
#  'shf': 'shaft'}

Use this dictionary to perform a `replace` operation with regex=True:

df['messages'].replace(dict(s.str.split(':').tolist()), regex=True)

0    we need to fix the car magnitude but we can't
1            we need a shaft to perform eng change
2                         gear is needed to change
3                             battery needs change
4                     car towed for exhaust change
5                              car mileage is high
Name: messages, dtype: object
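One thing to watch for with plain `regex=True` is that each key is matched as a substring, so it can also fire inside longer words. A small sketch with made-up messages (the Series `demo` is not part of the question's data) illustrates this:

```python
import pandas as pd

# Two of the abbreviations from the mapping above.
mapping = {"mag": "magnitude", "gr": "gear"}

# Made-up messages: the second one contains "mag" and "gr" inside other words.
demo = pd.Series(["the mag is stuck", "a magnet fell on the grass"])

result = demo.replace(mapping, regex=True)

print(result.tolist())
# ['the magnitude is stuck', 'a magnitudenet fell on the gearass']
```

The first message is handled as intended, while "magnet" and "grass" are altered because they merely contain a key.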

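Note that if these must be strictly whole-word replacements, you can extend this solution by turning each key into a regular expression wrapped in word boundaries (`\b`). For good measure, escape the keys as well, so that any regex metacharacters in them are matched literally: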

import re

mapping = {fr'\b{re.escape(k)}\b': v for k, v in s.str.split(':').tolist()}
df['messages'].replace(mapping, regex=True)

0    we need to fix the car magnitude but we can't
1            we need a shaft to perform eng change
2                         gear is needed to change
3                             battery needs change
4                     car towed for exhaust change
5                              car mileage is high
Name: messages, dtype: object
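Since the question asks about the whole CSV file rather than a single column, the same word-boundary mapping can also be handed to `DataFrame.replace`. The following is a sketch under stated assumptions: the CSV text, the `abbreviations` dict, and the in-memory buffers are illustrative stand-ins for real files.

```python
import io
import re

import pandas as pd

# Stand-in for the CSV file from the question (first three rows only).
csv_text = """infoID,messages
111,we need to fix the car mag but we can't
113,we need a shf to perform eng change
115,gr is needed to change
"""

abbreviations = {"mag": "magnitude", "shf": "shaft", "gr": "gear"}

# Wrap each escaped key in \b so that only whole words match.
mapping = {rf"\b{re.escape(k)}\b": v for k, v in abbreviations.items()}

df = pd.read_csv(io.StringIO(csv_text))

# With a flat dict and regex=True, the patterns are applied to the string
# columns of the frame; the integer infoID column is left unchanged.
cleaned = df.replace(mapping, regex=True)

# Write the result back out as CSV (to a buffer here; a path works too).
out = io.StringIO()
cleaned.to_csv(out, index=False)
print(out.getvalue())
```

Passing a file path to `pd.read_csv` and `to_csv` in place of the buffers would read and rewrite a real file.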

User

Answer 2 · Edited 2024-04-25 10:27:27

Another approach, using pd.Series.apply:

# d starts out as the raw mapping text, one "key:value" pair per line
d = dict(i.split(':') for i in d.split('\n'))
#{'bat': 'battery',
# 'ext': 'exhaust',
# 'gr': 'gear',
# 'mag': 'magnitude',
# 'ml': 'mileage',
# 'shf': 'shaft'}

df['messages'].apply(lambda x: ' '.join(d.get(i, i) for i in x.split()))

Output:

0    we need to fix the car magnitude but we can't
1            we need a shaft to perform eng change
2                         gear is needed to change
3                             battery needs change
4                     car towed for exhaust change
5                              car mileage is high
Name: messages, dtype: object
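Because this approach splits each message on whitespace and looks up whole tokens, it behaves a little differently from the regex version. A short sketch with made-up messages (the Series `demo` is illustrative only) shows two effects: a token with punctuation attached is not found in the dictionary, and runs of whitespace are collapsed to single spaces by the split and join.

```python
import pandas as pd

d = {"mag": "magnitude", "gr": "gear"}

# Made-up messages: one with a comma attached to "mag", one with extra spaces.
demo = pd.Series(["check the mag, then the gr", "mag  and   gr"])

# Look up each whitespace-separated token; fall back to the token itself.
result = demo.apply(lambda x: " ".join(d.get(tok, tok) for tok in x.split()))

print(result.tolist())
# ['check the mag, then the gear', 'magnitude and gear']
```

If the messages contain punctuation next to abbreviations, the word-boundary regex approach may be the safer choice.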
