Pandas: regex replace on strings ending with a tab does not work

-1 votes

3 answers

46 views

Asked 2025-04-13 12:54

I have a DataFrame like this:

df = pd.DataFrame({'Depth':['7500', '7800', '8300', '8500'],
                'Gas':['25-13 PASON', '9/8 PASON', '19/14', '56/26'],
                'ID':[1, 2, 3, 4]})

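I want to append "PASON" to every value in the "Gas" column that does not already end with it, so that, for example, '19/14' becomes '19/14 PASON'.

I thought a simple regex replace would do it, but it does not work. Here is my code: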

df['Gas'] = df['Gas'].replace(to_replace =r'(\d+/\d+)(\t)(\d+)', value = r'\1 PASON\2', regex = True)

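I tested this regex in a regex checker and it works fine there, but it does nothing when I use it in Pandas. What am I missing?

Thanks!

Tags: data processing, string replacement, data cleaning, data analysis, pandas, dataframe, data science, regex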

3 Answers

Use a simple in-place modification that combines str.endswith with boolean indexing:

df.loc[~df['Gas'].str.endswith('PASON'), 'Gas'] += ' PASON'

Output:

  Depth          Gas  ID
0  7500  25-13 PASON   1
1  7800    9/8 PASON   2
2  8300  19/14 PASON   3
3  8500  56/26 PASON   4
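
As for why the original regex did nothing: the pattern only matches a fraction that is followed by a tab and more digits, and none of the sample values contain a tab, so there is nothing to replace. The sketch below checks this with the standard `re` module and shows one possible regex-only alternative based on a negative lookbehind. The names `original_pattern` and `append_pattern` are illustrative, not from the question, and the alternative assumes the goal is simply "append ' PASON' unless the value already ends with it".

```python
import re

values = ['25-13 PASON', '9/8 PASON', '19/14', '56/26']

# Pattern from the question: digits/digits, then a TAB, then digits.
original_pattern = r'(\d+/\d+)(\t)(\d+)'

# None of the sample values contains a tab, so the pattern never matches.
matches = [bool(re.search(original_pattern, v)) for v in values]
print(matches)

# Illustrative alternative: match the end of the string only when it is
# NOT immediately preceded by 'PASON', and insert the suffix there.
append_pattern = r'(?<!PASON)$'
fixed = [re.sub(append_pattern, ' PASON', v) for v in values]
print(fixed)
```

The same `append_pattern` should work with `Series.str.replace(append_pattern, ' PASON', regex=True)` on ordinary object-dtype string columns, although the `endswith` approach above is easier to read.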

Answered 2025-04-13 by Python大师


Simply do this:

import pandas as pd

df = pd.DataFrame({
    'Depth': ['7500', '7800', '8300', '8500'],
    'Gas': ['25-13 PASON', '9/8 PASON', '19/14', '56/26'],
    'ID': [1, 2, 3, 4]
})

df['Gas'] = df['Gas'].apply(lambda x: x if x.endswith('PASON') else f"{x} PASON")

print(df)

This gives:

  Depth          Gas  ID
0  7500  25-13 PASON   1
1  7800    9/8 PASON   2
2  8300  19/14 PASON   3
3  8500  56/26 PASON   4

If you only want to add PASON when the value contains a digit:

import pandas as pd
import re

df = pd.DataFrame({
    'Depth': ['7500', '7800', '8300', '8500', '40404'],
    'Gas': ['25-13 PASON', 'Something PASON', 'AnotherValue', '12/34', 'NoDigitsHere'],
    'ID': [1, 2, 3, 4, 5]
})


df['Gas'] = df['Gas'].apply(lambda x: f"{x} PASON" if re.search(r'\d', x) and not x.endswith('PASON') else x)

print(df)

This gives:

   Depth              Gas  ID
0   7500      25-13 PASON   1
1   7800  Something PASON   2
2   8300     AnotherValue   3
3   8500      12/34 PASON   4
4  40404     NoDigitsHere   5
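
The same digit-aware rule can also be written without `apply`, using vectorized string methods and a boolean mask. This is a minimal sketch under the same sample data as above; the variable name `mask` is an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    'Depth': ['7500', '7800', '8300', '8500', '40404'],
    'Gas': ['25-13 PASON', 'Something PASON', 'AnotherValue', '12/34', 'NoDigitsHere'],
    'ID': [1, 2, 3, 4, 5]
})

# True where the value contains at least one digit AND does not end with 'PASON'.
mask = df['Gas'].str.contains(r'\d') & ~df['Gas'].str.endswith('PASON')

# Append the suffix only on the selected rows; all other rows are untouched.
df.loc[mask, 'Gas'] = df.loc[mask, 'Gas'] + ' PASON'

print(df)
```

On large frames the mask-based version avoids calling a Python lambda per row, which is usually the main reason to prefer it over `apply`.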

Answered 2025-04-13 by Python大师


You can simply check whether the string contains the substring 'PASON' and append it if it does not:

df['Gas'] = df['Gas'].where(df['Gas'].str.contains('PASON'), df['Gas'] + ' PASON')

That does it:

  Depth          Gas  ID
0  7500  25-13 PASON   1
1  7800    9/8 PASON   2
2  8300  19/14 PASON   3
3  8500  56/26 PASON   4
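
One thing to keep in mind with this approach: `str.contains` is true when 'PASON' appears anywhere in the value, not only at the end. The sketch below compares it with `str.endswith` on a small Series. The value 'PASON 12/34' is a hypothetical example that does not appear in the question's data, and the names `by_contains` and `by_endswith` are illustrative.

```python
import pandas as pd

# 'PASON 12/34' is a made-up value with the word at the start, not the end.
gas = pd.Series(['19/14', '9/8 PASON', 'PASON 12/34'])

# Series.where keeps the original value where the condition is True
# and takes the replacement (value + ' PASON') where it is False.
by_contains = gas.where(gas.str.contains('PASON'), gas + ' PASON')
by_endswith = gas.where(gas.str.endswith('PASON'), gas + ' PASON')

print(by_contains.tolist())
print(by_endswith.tolist())
```

For the data in the question both conditions give the same result; they only differ when 'PASON' can occur somewhere other than the end of the value.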

Answered 2025-04-13 by Python大师
