Regex: how do I remove all non-alphanumeric characters except the colons in 12/24-hour timestamps?

4 votes
6 answers
8000 views
Asked 2025-04-15 16:05

I have a string like this:

Today, 3:30pm - Group Meeting to discuss "big idea"

How can I build a regex that parses it and returns:

Today 3:30pm Group Meeting to discuss big idea

I want it to strip every character that is not a letter or a digit, except for characters that are part of a 12-hour or 24-hour time.

6 Answers

1

I assume you also want to keep spaces. This is written in Python, but the pattern is PCRE-compatible, so it should work elsewhere too.

import re
x = u'Today, 3:30pm - Group Meeting to discuss "big idea"'
re.sub(r'[^a-zA-Z0-9: ]', '', x)

Output: 'Today 3:30pm  Group Meeting to discuss big idea'

If you want a slightly cleaner result (no repeated spaces):

import re
x = u'Today, 3:30pm - Group Meeting to discuss "big idea"'
tmp = re.sub(r'[^a-zA-Z0-9: ]', '', x)
re.sub(r'[ ]+', ' ', tmp)

Output: 'Today 3:30pm Group Meeting to discuss big idea'
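Note that the character class above keeps every colon, not only those inside a time. As a rough sketch of one way to narrow that down, the snippet below matches either a time or an unwanted character and puts only the time back. It assumes a time looks like one or two digits, a colon, then two digits; the name strip_keep_times is made up for this example.

```python
import re

# Assumption: a time is 1-2 digits, a colon, then two digits (e.g. 3:30, 16:47).
TOKEN = re.compile(r'(\d{1,2}:\d{2})|[^a-zA-Z0-9 ]')

def strip_keep_times(text):
    """Remove non-alphanumeric characters, keeping colons only inside times."""
    # A matched time (group 1) is put back unchanged; any other match is dropped.
    cleaned = TOKEN.sub(lambda m: m.group(1) or '', text)
    # Collapse the runs of spaces left behind by removed punctuation.
    return re.sub(r' +', ' ', cleaned)

print(strip_keep_times('Today, 3:30pm - Group Meeting: discuss "big idea"'))
```

Because the time alternative is tried first at each position, the colon in 3:30pm survives while the one after "Meeting" is removed.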

2

In Python:

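import string
import time

punct = string.punctuation
s = 'Today, 3:30pm - Group Meeting:am to discuss "big idea" by our madam'
for item in s.split():
    try:
        # Keep the token untouched if it parses as a time such as 3:30pm.
        time.strptime(item, "%H:%M%p")
    except ValueError:
        # Not a time: strip every punctuation character from the token.
        item = ''.join(c for c in item if c not in punct)
    print(item, end=' ')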

Output:

$ ./python.py
Today 3:30pm  Group Meetingam to discuss big idea by our madam

# change to s='Today, 15:30pm - Group 1,2,3 Meeting to di4sc::uss3: 2:3:4 "big idea" on 03:33pm or 16:47 is also good'

$ ./python.py
Today 15:30pm  Group 123 Meeting to di4scuss3 234 big idea on 03:33pm or 1647 is also good

Note: this should be improved to check for a valid time only when necessary (by adding a condition), but I'll leave it as is for now.
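One possible shape for that improvement, as a sketch only: parse a token as a time only when it contains a colon, and try a 12-hour and a 24-hour format in turn. The format list and the helper names is_time and clean are assumptions for illustration. Unlike the loop above, this version keeps a bare 16:47 and rejects 15:30pm.

```python
import string
import time

# Assumed formats: 12-hour with an am/pm suffix, and plain 24-hour.
TIME_FORMATS = ("%I:%M%p", "%H:%M")

def is_time(token):
    """Return True if the token parses with one of TIME_FORMATS."""
    for fmt in TIME_FORMATS:
        try:
            time.strptime(token, fmt)
            return True
        except ValueError:
            pass
    return False

def clean(text):
    words = []
    for token in text.split():
        # Only a token containing a colon can be a time, so skip the parse otherwise.
        if ":" not in token or not is_time(token):
            token = "".join(ch for ch in token if ch not in string.punctuation)
        if token:  # drop tokens that were pure punctuation, such as "-"
            words.append(token)
    return " ".join(words)

print(clean('Today, 3:30pm - Group Meeting:am to discuss "big idea" or 16:47'))
```

Joining the surviving tokens with a single space also avoids the doubled space that the original loop prints where the "-" used to be.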

8
# this: D:DD, DD:DDam/pm 12/24 hr
pattern = r':(?=..(?<!\d:\d\d))|[^a-zA-Z0-9 ](?<!:)'

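A colon must be preceded by at least one digit and followed by at least two digits to count as part of a time. Every other colon is treated as an ordinary textual colon and removed.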

How it works

:              // match a colon
(?=..          // lookahead: peek at the next two chars without consuming them
  (?<!         // start a negative look-behind group (if it matches, the whole fails)
    \d:\d\d    // time stamp
  )            // end neg. look behind
)              // end lookahead
|              // or
[^a-zA-Z0-9 ]  // match anything that is not a letter, digit, or space
(?<!:)         // that isn't a colon

Then, when the pattern is applied to this silly piece of text:

Today, 3:30pm - Group 1,2,3 Meeting to di4sc::uss3: 2:3:4 "big idea" on 03:33pm or 16:47 is also good

...it turns it into:

Today 3:30pm  Group 123 Meeting to di4scuss3 234 big idea on 03:33pm or 16:47 is also good
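For anyone who wants to try it, here is a minimal Python sketch that applies the pattern with re.sub. The constant name STRIP is only for illustration, and the second call shows an edge case worth knowing about.

```python
import re

# The pattern from this answer, compiled under a name that does not shadow the re module.
STRIP = re.compile(r':(?=..(?<!\d:\d\d))|[^a-zA-Z0-9 ](?<!:)')

text = ('Today, 3:30pm - Group 1,2,3 Meeting to di4sc::uss3: 2:3:4 '
        '"big idea" on 03:33pm or 16:47 is also good')
result = STRIP.sub('', text)
print(result)

# A colon with fewer than two characters after it is left in place,
# because the lookahead `..` cannot match there.
print(STRIP.sub('', 'done:'))  # prints: done:
```

If trailing colons matter for your input, the pattern would need another alternative to cover them.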
