I have a regular expression that detects dates of birth in a given paragraph of text.
import re
dob = re.compile(r'(?:\bbirth\b|\bbirth(?:day|date)).{0,20}\n? \b((?:(?<!\:)(?<!\:\d)[0-3]?\d(?:st|nd|rd|th)?\s+(?:of\s+)?(?:jan\.?|january|feb\.?|february|mar\.?|march|apr\.?|april|may|jun\.?|june|jul\.?|july|aug\.?|august|sep\.?|september|oct\.?|october|nov\.?|november|dec\.?|december)|(?:jan\.?|january|feb\.?|february|mar\.?|march|apr\.?|april|may|jun\.?|june|jul\.?|july|aug\.?|august|sep\.?|september|oct\.?|october|nov\.?|november|dec\.?|december)\s+(?<!\:)(?<!\:\d)[0-3]?\d(?:st|nd|rd|th)?)(?:\,)?\s*(?:\d{4})?|\b[0-3]?\d[-\./][0-3]?\d[-\./]\d{2,4})\b',re.IGNORECASE | re.MULTILINE)
data = " Hi This is Goku and my birthday is on 6th Aug but to be clear it is on 1994-08-06."
l = dob.findall(data)
print(l)
o/p: ['6th Aug ']
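I want to add just one more feature: if the text contains something in YYYY-MM-DD format, that should also be treated as a date of birth.
(where YYYY is 19XX-20XX, MM is 01-12, and DD is 01-31)
For example: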
data = " Hi This is Goku and my birthday is on 6th Aug but to be clear it is on 1994-08-06."
then the output should be
output: ['6th Aug ', '1994-08-06']
Where in the regex can I add a part so that it also detects this YYYY-MM-DD format?
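To detect YYYY-MM-DD, add one more alternative to the pattern: a year restricted to 19XX-20XX, a month in 01-12, and a day in 01-31, joined by hyphens.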
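One possible way to do this, sketched under assumptions rather than taken from the original answer: the requested output includes `1994-08-06` even though it sits far from the word "birthday", so the YYYY-MM-DD part probably cannot live inside the existing group that requires a nearby "birth" keyword. The sketch below keeps it as a separate pattern and merges the results by position. The names `iso_dob` and `find_all_dates` are hypothetical, and the pattern checks only the stated ranges, not the number of days in each month.

```python
import re

# Hypothetical standalone pattern for YYYY-MM-DD:
# year 1900-2099, month 01-12, day 01-31 (no per-month day-count check).
iso_dob = re.compile(
    r'\b(?:19|20)\d{2}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])\b'
)

def find_all_dates(text, context_pattern):
    """Merge group-1 matches of context_pattern with iso_dob matches, in text order.

    context_pattern is assumed to capture the date in group 1 on every match,
    as the dob pattern in the question does.
    """
    hits = [(m.start(1), m.group(1)) for m in context_pattern.finditer(text)]
    hits += [(m.start(), m.group()) for m in iso_dob.finditer(text)]
    return [value for _, value in sorted(hits)]

data = " Hi This is Goku and my birthday is on 6th Aug but to be clear it is on 1994-08-06."
print(iso_dob.findall(data))  # ['1994-08-06']
```

Assuming `dob` from the question returns `'6th Aug '` for this input as reported, `find_all_dates(data, dob)` should give the requested list. Keeping the two patterns separate also avoids `findall` returning tuples, which is what happens once a pattern has more than one capturing group.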