Remove commas in a string that is enclosed by a comma and double quotes / Python

Posted 2024-05-15 04:42:44


I found some similar topics on Stack Overflow, but I'm still fairly new to Python and regular expressions.

I have a string

,"Completely renovated in 2009, the 2-star Superior Hotel Ibis Berlin Messe, with its 168 air-conditioned rooms, is located right next to Berlin's ICC and exhibition center. All rooms have Wi-Fi, and you can surf the Internet free of charge at two iPoint-PCs in the lobby. We provide a 24-hour bar, snacks and reception service. Enjoy our breakfast buffet from 4am to 12pm on the 8th floor, where you have a fantastic view across Berlin. You will find free car parking directly next to the hotel.",

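The pattern should be: comma, double quote | any text with commas | double quote, comma. I need to replace the commas inside the double quotes, for example with the @ character. Which regex pattern should I use?

I tried this: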

r',"([.*]*,[.*]*)*",' 

in several variations, but it doesn't work.

Thanks for the answers, the problem is solved.


3 answers

If all you need to do is replace commas with the @ character, you should consider using str.replace instead of a regex.

str_a = "Completely renovated in 2009, the 2-star Superior Hotel Ibis Berlin Messe, with its 168 air-conditioned rooms, is located right next to Berlin's ICC and exhibition center. All rooms have Wi-Fi, and you can surf the Internet free of charge at two iPoint-PCs in the lobby. We provide a 24-hour bar, snacks and reception service. Enjoy our breakfast buffet from 4am to 12pm on the 8th floor, where you have a fantastic view across Berlin. You will find free car parking directly next to the hotel."

str_a = str_a.replace('","', '@')  # replace each literal "," sequence (quote, comma, quote)
str_a = str_a.replace(',', '@')  # replace every remaining comma

print(str_a)

Edit: Alternatively, you can make a list of the things to replace, then loop over it and perform each replacement. For example:

to_replace = ['""', ',', '"']

str_a = "Completely renovated in 2009, the 2-star Superior Hotel Ibis Berlin Messe, with its 168 air-conditioned rooms, is located right next to Berlin's ICC and exhibition center. All rooms have Wi-Fi, and you can surf the Internet free of charge at two iPoint-PCs in the lobby. We provide a 24-hour bar, snacks and reception service. Enjoy our breakfast buffet from 4am to 12pm on the 8th floor, where you have a fantastic view across Berlin. You will find free car parking directly next to the hotel."

for a in to_replace:
    str_a = str_a.replace(a, '@')

print(str_a)
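The replacements above touch every comma in the string, which is fine when the delimiters have already been stripped. As a minimal sketch, assuming the field always starts with `,"` and ends with `",` as in the question, the delimiters can be kept and only the interior changed. The helper name `replace_inner_commas` is hypothetical, not part of any library:

```python
def replace_inner_commas(field, replacement="@"):
    """Replace commas between a leading ," and a trailing ", (hypothetical helper)."""
    prefix, suffix = ',"', '",'
    has_shape = (
        len(field) >= len(prefix) + len(suffix)
        and field.startswith(prefix)
        and field.endswith(suffix)
    )
    if not has_shape:
        # Input does not have the assumed shape: return it unchanged.
        return field
    inner = field[len(prefix):-len(suffix)]
    return prefix + inner.replace(",", replacement) + suffix


sample = ',"a 24-hour bar, snacks and reception service, free parking.",'
result = replace_inner_commas(sample)
print(result)
```

This only handles a single quoted field; strings with several quoted sections need a different approach.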

Well, your regex is rather fishy.

,"([.*]*,[.*]*)*",

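[.*] will match a literal dot or asterisk (. and * become literals inside a character class).

Also, even if this could actually match something in the string, it could only replace one comma, because the rest of the string (including the other commas) would be consumed by the regex, and once consumed it cannot be replaced again, unless you run a loop until there are no more commas left to replace.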
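A short check of both points, as a sketch using shortened sample strings that are not from the question: the character class only matches literal dots and asterisks, so the attempted pattern fails on ordinary text, and when it does match it consumes the whole quoted span in one go.

```python
import re

# Inside a character class, "." and "*" lose their special meaning.
literal_hits = re.findall(r'[.*]', 'a.b*c,d')
print(literal_hits)

attempt = r',"([.*]*,[.*]*)*",'
text = ',"Completely renovated in 2009, the 2-star hotel.",'

# Ordinary words between the quotes cannot be matched by [.*].
no_match = re.search(attempt, text)
print(no_match)

# With only dots, asterisks and commas between the quotes it does match,
# but the entire span is consumed by that single match.
whole_span = re.search(attempt, ',"..,**,.",').group(0)
print(whole_span)
```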

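One way to replace those commas with re.sub is to use lookarounds (search for the term; there is plenty of documentation on them). If there is only one pair of double quotes, you can restrict the replacement to commas that are followed by exactly one double quote before the end of the string: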

,(?=[^"]*"[^"]*$)

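[^"] means a character that is not a double quote. [^"]* means that this is repeated 0 or more times.

$ means the end of the string (or the end of each line if re.MULTILINE is set).

Now, the lookahead (?= ... ) checks what follows the comma without consuming it.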


After that, you can simply replace the commas with whatever value you want.

import re
text = re.sub(r',(?=[^"]*"[^"]*$)', '@', text)  # "text" holds the input string; avoids shadowing the built-in str
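As a self-contained sketch of the single-pair case, using a shortened version of the hotel description as an assumed input: the two delimiter commas are left alone because the first has two quotes ahead of it and the last has none.

```python
import re

# Shortened sample with exactly one pair of double quotes.
text = ',"We provide a 24-hour bar, snacks and reception service.",'

# A comma matches only if exactly one double quote remains before the end.
single_pair = re.compile(r',(?=[^"]*"[^"]*$)')
result = single_pair.sub('@', text)
print(result)
```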

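However, if there are multiple double quotes, you should make sure that an odd number of double quotes follows the comma. This can be done with the regex: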

,(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$)

By the way, (?: ... ) is a non-capturing group.
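To see that this pattern really encodes "an odd number of double quotes follows", the sketch below compares it with the same rule written out by hand. The sample string is made up for illustration and assumes balanced quotes.

```python
import re

odd_quotes_ahead = re.compile(r',(?=[^"]*"[^"]*(?:"[^"]*"[^"]*)*$)')
text = 'a, b "c, d" e, "f, g", h'

regex_result = odd_quotes_ahead.sub('@', text)

# Same rule spelled out: replace a comma when the number of double
# quotes after it is odd.
manual_result = ''.join(
    '@' if ch == ',' and text.count('"', i + 1) % 2 == 1 else ch
    for i, ch in enumerate(text)
)

print(regex_result)
print(manual_result)
```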

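You can try this (ugly though it is). The trick here is that any character inside a pair of double quotes is followed by an odd number of double quotes, assuming, of course, that your double quotes are balanced: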

s = 'some comma , outside "Some comma , inside" , "Completely , renovated in 2009",'

import re
s = re.sub(r',(?=[^"]*"(?:[^"]*"[^"]*")*[^"]*$)', "@", s)
print(s)

Output

some comma , outside "Some comma @ inside" , "Completely @ renovated in 2009",
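For comparison, here is a different technique that is not taken from the answers: match each quoted section as a whole and rewrite it in a replacement function passed to re.sub. It is only a sketch, it assumes balanced quotes with no escaped quotes inside, and the name `replace_in_quotes` is hypothetical.

```python
import re


def replace_in_quotes(text, replacement="@"):
    """Replace commas inside each "..." section (hypothetical helper)."""
    return re.sub(
        r'"[^"]*"',                                      # one quoted section
        lambda m: m.group(0).replace(",", replacement),  # rewrite its commas
        text,
    )


s = 'some comma , outside "Some comma , inside" , "Completely , renovated in 2009",'
result = replace_in_quotes(s)
print(result)
```

On this input it produces the same output as the lookahead version, and it avoids rescanning the rest of the string for every comma.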
