<p>If it doesn't matter whether <code>Approve</code>/<code>Approved</code>/<code>Approval</code> appears in the subject or in the body of the email, you could do this:</p>
<pre><code>import re
text = '''From: Jerrmy Bret <jeremy.brett@mnop.com>
To: Jonathan Small <j.small@xyz.com>
Date: 21 Sep 2019
Subject: Stuff

FYI...

From: Keven Koster <keve.koster@mnop.com>
To: Jerrmy Bret <jeremy.brett@mnop.com>
Date: 21 Sep 2019
Subject: Approval Required for Travel

Can't Approve as Ruth's approval is required

From: Jerrmy Bret <jeremy.brett@mnop.com>
To: Keven Koster <keve.koster@mnop.com>
Date: 21 Sep 2019
Subject: Approval Required for Travel

ok thanks Keven, will talk to Ruth
'''
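
# Split the text into individual emails: each match starts at "From:" and
# runs until just before the next "From:" (or the end of the text).
email_regex = re.compile(
    r'(From:(?:(?!From:).)+)',
    re.DOTALL|re.MULTILINE
)

# Match "approve", "approved" or "approval" in any letter case.
approval_regex = re.compile(
    r'approv(?:e|ed|al)',
    re.IGNORECASE
)

# Keep only the emails that mention approval anywhere (subject or body).
approved_emails = [
    email for email in email_regex.findall(text)
    if approval_regex.search(email)
]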
print(approved_emails)
# output
[
"From: Keven Koster <keve.koster@mnop.com>\nTo: Jerrmy Bret <jeremy.brett@mnop.com>\nDate: 21 Sep 2019\nSubject: Approval Required for Travel\n\nCan't Approve as Ruth's approval is required\n\n",
'From: Jerrmy Bret <jeremy.brett@mnop.com>\nTo: Keven Koster <keve.koster@mnop.com>\nDate: 21 Sep 2019\nSubject: Approval Required for Travel\n\nok thanks Keven, will talk to Ruth\n'
]
</code></pre>
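<p>To see how the splitting pattern behaves on its own, here is a minimal sketch using a made-up two-message <code>sample</code> string (the sample is hypothetical, not from the question). Each match starts at <code>From:</code>, and the negative lookahead <code>(?!From:)</code> is checked before every character, so the match stops just before the next <code>From:</code>. One caveat: a body that itself contains the text <code>From:</code> would be split at that point as well.</p>

```python
import re

# Same splitting pattern as above; re.DOTALL lets "." match newlines too.
email_regex = re.compile(r'(From:(?:(?!From:).)+)', re.DOTALL)

# Hypothetical sample with two messages.
sample = "From: a\nSubject: one\n\nhello\n\nFrom: b\nSubject: two\n\nbye\n"

chunks = email_regex.findall(sample)
print(len(chunks))       # number of emails found
print(repr(chunks[0]))   # first email, up to (not including) the next "From:"
```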
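<p>If it does matter, and the keyword has to appear in the body rather than only in the subject line, you can change <code>approval_regex</code> to this:</p>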
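<pre><code># With re.DOTALL, ".+" can span lines, so the keyword only counts when it
# appears on a line after the "Subject:" line, not in the subject line itself.
approval_regex = re.compile(
    r'Subject:.+\n.*approv(?:e|ed|al)',
    re.IGNORECASE|re.DOTALL|re.MULTILINE
)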
</code></pre>
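<p>As a rough check of what the stricter pattern accepts, the sketch below runs it on two hypothetical emails (<code>subject_only</code> and <code>body_mention</code> are invented for illustration). It also shows one possible alternative, a helper named <code>body_mentions_approval</code> (my own name, not from the question) that splits headers from body at the first blank line. That helper assumes every email has a blank line between its headers and its body.</p>

```python
import re

# Hypothetical emails, already split into one string per message.
subject_only = "From: a\nSubject: Approval Required\n\nok thanks, will talk to Ruth\n"
body_mention = "From: b\nSubject: Travel\n\nCan't approve this yet\n"

# The stricter pattern: the keyword must come after a newline that follows "Subject:".
approval_regex = re.compile(
    r'Subject:.+\n.*approv(?:e|ed|al)',
    re.IGNORECASE | re.DOTALL
)

in_body_1 = bool(approval_regex.search(subject_only))  # keyword only in subject line
in_body_2 = bool(approval_regex.search(body_mention))  # keyword in the body
print(in_body_1, in_body_2)


def body_mentions_approval(email):
    """Alternative sketch: search only the part after the first blank line.

    Assumes headers and body are separated by a blank line.
    """
    _headers, _sep, body = email.partition("\n\n")
    return re.search(r'approv(?:e|ed|al)', body, re.IGNORECASE) is not None


print(body_mentions_approval(subject_only), body_mentions_approval(body_mention))
```

<p>Splitting headers from body explicitly is a little more verbose, but it makes the "body only" rule visible in the code instead of hiding it in the regex flags.</p>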