Formatting the contents of a column: removing trailing text and digits

Posted 2024-04-20 15:42:44


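I have created a CSV using BeautifulSoup and pandas. One of its columns contains error codes followed by the corresponding error messages.

Before formatting, the column looks like this: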

-132456ErrorMessage
-3254Some other Error 
-45466You've now used 3 different examples. 2 more to go. 
-10240 This time there was a space.    
-1232113That was a long number.

I managed to isolate the message text like this:

dfDSError['text']  = dfDSError['text'].map(lambda x: x.lstrip('-0123456789'))

That does exactly what I want.

But I am struggling to come up with a solution for the codes.

I tried this:

dfDSError['codes'] = dfDSError['codes'].replace(regex=True,to_replace= r'\D',value=r'')

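But this appends the digits from the error message to the end of the code. For the third example above, I get 4546632 instead of 45466. I would also like to keep the leading minus sign.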
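
The behaviour described above can be reproduced with a minimal sketch. The two sample rows come from the column shown earlier; the variable names `codes` and `digits_only` are made up for illustration.

```python
import pandas as pd

# Two sample rows copied from the column above; the variable names are illustrative.
codes = pd.Series([
    "-132456ErrorMessage",
    "-45466You've now used 3 different examples. 2 more to go. ",
])

# Deleting every non-digit character also deletes the minus sign and
# glues the digits found in the message onto the end of the code.
digits_only = codes.replace(regex=True, to_replace=r"\D", value=r"")
print(digits_only.tolist())  # ['132456', '4546632']
```

Because `\D` matches anything that is not a digit, the pattern cannot tell the leading code apart from numbers that occur later in the message.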

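I thought I might be able to combine rstrip() with a regex to find where there is a non-digit, or a space next to a space, and remove everything else, but I have not been successful.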

for_removal = re.compile(r'\d\D*')
dfDSError['codes']  = dfDSError['codes'].map(lambda x: x.rstrip(re.findall(for_removal,x)))                         
TypeError: rstrip arg must be None, unicode or str
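
The TypeError arises because `re.findall` returns a list, while `str.rstrip` expects a string of characters to strip (or `None`), not a pattern or a list of matches. One possible way around it, sketched here on the assumption that every value starts with an optional minus sign followed by digits, is to match the leading code directly. The helper name `extract_code` is hypothetical.

```python
import re

# Assumption: each value begins with an optional minus sign and then digits.
leading_code = re.compile(r"-?\d+")

def extract_code(value):
    """Return the leading signed number as a string, or None if there is none."""
    match = leading_code.match(value)  # match() only looks at the start of the string
    return match.group(0) if match else None

print(extract_code("-45466You've now used 3 different examples. 2 more to go. "))  # -45466
print(extract_code("-10240 This time there was a space.    "))  # -10240
```

A helper like this could be passed to `.map()` in the same way as the lambda above.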

Any suggestions? Thanks!


1 answer

Answer 1 · posted 2024-04-20 15:42:44

You can use str.extract:

# Capture the leading sign and digits as 'code' and the rest of the string as 'text'.
dfDSError[['code','text']] = dfDSError.text.str.extract(r'([-0-9]+)(.*)', expand=True)
print(dfDSError)
                                                text      code
0                                       ErrorMessage   -132456
1                                  Some other Error      -3254
2  You've now used 3 different examples. 2 more t...    -45466
3                   This time there was a space.        -10240
4                            That was a long number.  -1232113
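
For anyone who wants to try this without the original CSV, here is a self-contained sketch built on the sample rows from the question. The pattern `^(-?\d+)\s*(.*)$`, the integer conversion, and the whitespace trimming are assumptions added for illustration and are not part of the answer above. The pattern is slightly stricter than `[-0-9]+`, which would also accept a minus sign in the middle of the digits.

```python
import pandas as pd

# Sample data copied from the question; the DataFrame name is illustrative.
df = pd.DataFrame({"text": [
    "-132456ErrorMessage",
    "-3254Some other Error ",
    "-45466You've now used 3 different examples. 2 more to go. ",
    "-10240 This time there was a space.    ",
    "-1232113That was a long number.",
]})

# Group 1: optional minus sign plus digits. Group 2: the rest of the line.
parts = df["text"].str.extract(r"^(-?\d+)\s*(.*)$", expand=True)

df["code"] = parts[0].astype(int)   # keeps the sign, e.g. -45466
df["text"] = parts[1].str.strip()   # drop trailing spaces from the message
print(df)
```

Digits inside the message, such as the "3" and "2" in the third row, stay in the text column because the first group stops at the first non-digit character.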
