Separating individual emails from a mail trail using Python

Posted 2024-05-13 04:17:17


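I am working on an email-related PoC in Python. The system receives emails that contain a mail trail. I want to separate the individual emails out of the trail and process each one. The problem is that I have not found suitable code or a library for this. Can anyone help?

For example, the system receives an email like the following: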

SUBJECT: RE: CALL ID #  98670786 CALL ID #  98983051 DATE SENT: 23-JANUARY-2017 TIME SENT: 17:56:09 PM SENDER ID: abc@xyz.COM MESSAGE TEXT: DEAR SIR, 

Please check and let me know 
REGARDS 

XXXXXXX 
00000015 

FROM: company;sender@company.COM 
SENT:MON, 23 JAN 2017 16:04:26 +0530 
TO: abc@xyz.COM 
SUBJECT: RE: RE: CALL ID #  98670786 CALL ID #  98983051 
DEAR MR. XXXXX, 
> 
>WE REFER TO YOUR EMAIL DATED 20/01/2017 FOR THE company. 
> 
>WE HEREBY INFORM YOU THAT WE HAVE CHECKED WITH OUR TOUCH POINT AND THEY HAVE CONFIRMED THAT THE Things HAS BEEN delivered TO YOU AND WE WOULD KINDLY REQUEST YOU TO CHECK YOUR at your end FOR BETTER ASSISTANCE. 
> 
>  
> 
>YOURS SINCERELY, 
> 
>sender, 
>Company
>---------------------------------------------------------- 
>Disclaimers: 
>adsadsadsadadasdada 
>daadsadadsadsadsa.  
>REGISTERED ADDRESS:-sadsadsadsadsadsadsadasdsadsadsadsa  
>---------------ORIGINAL MESSAGE------------------ 
>SUBJECT: CALL ID # 98418758 CALL ID # 98510240 CALL ID # 98670786 DATE SENT: 20-JANUARY-2017 TIME SENT: 11:06:38 AM SENDER ID: abc@xyz.COM MESSAGE TEXT: DEAR SIR, 
> 
>BY WHEN WILL THIS things WILL BE delivered TO Me. 
> 
>REGARDS 
> 
>XXXXXXX 
> 
>00000015 
> 
>FROM: "company"sender@company.COM 
>SENT:FRI, 20 JAN 2017 10:44:16 +0530 
>TO: abc@xyz.COM 
>SUBJECT: RE: RE: CALL ID # 98510240 CALL ID # 98670786 
>DEAR MR. XXXXX, WE APPRECIATE YOUR TIME AND PATIENCE AND APOLOGIZE FOR THE LATE RESPONSE. 
>> 
>>WE REFER TO YOUR EMAIL DATED 11/01/2017N FOR company NUMBER 00000015. WITH REGARDS TO YOUR CONCERN WE HEREBY INFORM YOU THAT TILL DATE YOUR things is pending with us.  
>>TRUST THIS CLARIFIES YOUR CONCERN. YOURS SINCERELY, 
>> 
>>Sender. 
>>company 
>>---------------------------------------------------------- 
>>CALL CENTER TIMINGS: 10.00 A.M. TO 7.00 P.M MONDAY TO SATURDAY (EXCEPT NATIONAL HOLIDAYS) 

The email above should be split into the following four parts:

(1)

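SUBJECT: RE: CALL ID #  98670786 CALL ID #  98983051 DATE SENT: 23-JANUARY-2017 TIME SENT: 17:56:09 PM SENDER ID: abc@xyz.COM MESSAGE TEXT: DEAR SIR, 

Please check and let me know 
REGARDS 

XXXXXXX 
00000015 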

(2)

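FROM: company;sender@company.COM 
SENT:MON, 23 JAN 2017 16:04:26 +0530 
TO: abc@xyz.COM 
SUBJECT: RE: RE: CALL ID #  98670786 CALL ID #  98983051 
DEAR MR. XXXXX, 
> 
>WE REFER TO YOUR EMAIL DATED 20/01/2017 FOR THE company. 
> 
>WE HEREBY INFORM YOU THAT WE HAVE CHECKED WITH OUR TOUCH POINT AND THEY HAVE CONFIRMED THAT THE Things HAS BEEN delivered TO YOU AND WE WOULD KINDLY REQUEST YOU TO CHECK YOUR at your end FOR BETTER ASSISTANCE. 
> 
>  
> 
>YOURS SINCERELY, 
> 
>sender, 
>Company
>---------------------------------------------------------- 
>Disclaimers: 
>adsadsadsadadasdada 
>daadsadadsadsadsa.  
>REGISTERED ADDRESS:-sadsadsadsadsadsadsadasdsadsadsadsa  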

(3)

>---------------ORIGINAL MESSAGE------------------ 
>SUBJECT: CALL ID # 98418758 CALL ID # 98510240 CALL ID # 98670786 DATE SENT: 20-JANUARY-2017 TIME SENT: 11:06:38 AM SENDER ID: abc@xyz.COM MESSAGE TEXT: DEAR SIR, 
> 
>BY WHEN WILL THIS things WILL BE delivered TO Me. 
> 
>REGARDS 
> 
>XXXXXXX 
> 
>00000015 
> 

(4)

>FROM: "company"sender@company.COM 
>SENT:FRI, 20 JAN 2017 10:44:16 +0530 
>TO: abc@xyz.COM 
>SUBJECT: RE: RE: CALL ID # 98510240 CALL ID # 98670786 
>DEAR MR. XXXXX, WE APPRECIATE YOUR TIME AND PATIENCE AND APOLOGIZE FOR THE LATE RESPONSE. 
>> 
>>WE REFER TO YOUR EMAIL DATED 11/01/2017N FOR company NUMBER 00000015. WITH REGARDS TO YOUR CONCERN WE HEREBY INFORM YOU THAT TILL DATE YOUR things is pending with us.  
>>TRUST THIS CLARIFIES YOUR CONCERN. YOURS SINCERELY, 
>> 
>>Sender. 
>>company 
>>---------------------------------------------------------- 
>>CALL CENTER TIMINGS: 10.00 A.M. TO 7.00 P.M MONDAY TO SATURDAY (EXCEPT NATIONAL HOLIDAYS) 

Edit: after a lot of trial and error, I arrived at the following code.

import re

# Patterns that mark the start of a new message in the trail
startMsgPatter = re.compile(r"(\W*ORIGINAL\s*MESSAGE|\W*FROM\s*:|\W*ON.*WROTE\s*:)")

def sperateEmails(callDesc):
    blockStart = 0
    emails = []
    # Cut the text at the start of every marker match
    for m in startMsgPatter.finditer(callDesc):
        blockEnd = m.start()
        if blockStart >= blockEnd:
            continue
        emails.append(callDesc[blockStart:blockEnd])
        blockStart = blockEnd
    # Whatever follows the last marker is the final message
    emails.append(callDesc[blockStart:])
    return emails

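It works, but I have to keep finding new patterns that indicate where a message starts and ends, and keep updating the code. It seems to me that mail trails should follow a limited set of patterns. Has anyone written code that accounts for most of these patterns? If so, please share.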


1 answer
Anonymous user
#1 · Posted 2024-05-13 04:17:17

You can use the split() function.

Example:

"first mail separator second mail".split(" separator ")

This outputs:

["first mail", "second mail"]

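You just need to know which separator to use. Note that the separator is removed from the result, but you can add it back afterwards if you need it.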
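If you would rather not re-attach the separator by hand, one option is to split just before each occurrence with a zero-width lookahead. The following is a minimal sketch, assuming Python 3.7 or later (where re.split() can split on empty matches); the helper name split_keep_separator is made up for illustration.

```python
import re

def split_keep_separator(text, separator):
    """Split text just before each separator, so the separator stays
    at the start of the piece that follows it."""
    # (?=...) is a lookahead: it matches the position, not the characters.
    pattern = "(?=" + re.escape(separator) + ")"
    # Drop the empty piece produced when text starts with the separator.
    return [part for part in re.split(pattern, text) if part]

print(split_keep_separator("first mail separator second mail", "separator"))
```

Unlike str.split(), nothing is removed here, so joining the pieces gives back the original string.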

In your example, it looks like all messages are separated by the string

"---------------ORIGINAL MESSAGE------------------"

or

"FROM"

I suggest splitting on the first one and then on the second, like this:

messages = []  # split messages will be stored here
# mail_trail is the content of your mail trail
sep = mail_trail.split("---------------ORIGINAL MESSAGE------------------")
for msg in sep:
    sep2 = msg.split("FROM")
    # split() removed "FROM", so put it back on every piece after the first
    sep2 = sep2[:1] + ["FROM" + part for part in sep2[1:]]
    messages.extend(sep2)  # add the messages to the list

This should get you on the right track.
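Splitting on the bare string "FROM" also cuts wherever that word appears inside the message text, and the ORIGINAL MESSAGE banner is lost. A line-based variant avoids both. The sketch below is only that, a sketch: it assumes a marker counts only at the start of a line, after any ">" quote prefixes, and the names START_PATTERNS, MESSAGE_START and split_trail are hypothetical. The marker list covers the three formats mentioned in the question and would need extending for other mail clients.

```python
import re

# Assumed start-of-message markers; extend this list as new formats turn up.
START_PATTERNS = [
    r"-+\s*ORIGINAL\s+MESSAGE",  # dashed "ORIGINAL MESSAGE" banner
    r"FROM\s*:",                 # "FROM:" header line
    r"ON\b.*\bWROTE\s*:",        # "On <date>, <name> wrote:" line
]

# A marker only counts at the start of a line, after optional '>' prefixes.
MESSAGE_START = re.compile(
    r"[> \t]*(?:" + "|".join(START_PATTERNS) + ")", re.IGNORECASE
)

def split_trail(trail):
    """Split a mail trail into messages, keeping each marker line."""
    messages, current = [], []
    for line in trail.splitlines(keepends=True):
        # A marker line closes the message collected so far.
        if MESSAGE_START.match(line) and current:
            messages.append("".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("".join(current))
    return messages
```

Because every line is kept, "".join(split_trail(trail)) reproduces the input, and each piece after the first begins with the marker line that started it.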
