Script to automatically create dovecot.sieve rules for mail in inbox subfolders

Asked 2025-04-28 04:52

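After browsing this great site and using its solutions for a while, I finally decided to take part.

I have a fairly clear idea of what I want, but I am looking for the best way to implement it.

What do I want?

I have set up a mail server on a Raspberry Pi, and it works well. It consists of a dovecot server and some sieve filters that sort the mail for my many email addresses into different inbox subfolders. There is also a spam filter that learns every night, via a script, to tell ham from spam. (Simply put, it is taught that spam is in the Junk folder and that everything in the other folders is ham.)

I want to replicate this behaviour for a dedicated "Newsletter" folder. Nothing in that folder is urgent or needs to be looked at or reported right away.

My plan is to move mails into the "Newsletter" folder by hand and have a script scan that folder once a day. If it finds a mail from an address that has no sieve rule yet, it should create a rule that files mail from that address into the "Newsletter" folder automatically.

Implementation steps?

  • First, the script needs to scan the existing .dovecot.sieve file and extract the addresses from the "Newsletter folder" rule into a separate file or object for comparison.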

    /*Example of a sieve filter:*/
    
    require "fileinto";
    
     /* Global Spam Filter */
    if anyof (header :contains "subject" "*SPAM*",
              header :contains "X-Spam-Flag" "YES" ) {
      fileinto "Junk";
      stop;
    }
    
    /* LAN Emails Filter */
      elsif address :is "to" "lan@docbrown.pi" {
      fileinto "INBOX.Lokal";
      stop;
    }
    
    /* Newsletter Filter */
      elsif anyof (address :is "from" "newsletter@example.com",
                   address :is "from" "news@yahoo.de",
                   address :is "from" "info@mailbox.de",
                   address :is "from" "something@somewhere.de") {
      fileinto "INBOX.Newsletter";
      stop;
    }
    
     /* gmail Account Filter */
      elsif address :is "to" "docbrown@gmail.com" {
      fileinto "INBOX.gmail";
      stop;
    }
    
     /* Yahoo Account Filter */
      elsif address :is "to" "docbrown@yahoo.de" {
      fileinto "INBOX.yahoo";
      stop;
    }
    
      else {
      # The rest goes into INBOX
      # default is "implicit keep", we do it explicitly here
      keep;
    }
    
  • Then it needs to go through all mails in the "Newsletter" folder and look for the "From:" field and the email address in angle brackets.

    Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)
    From: Johannes Ebert - Redaktion c't <infoservice@heise.de> 
    To: docbrown@example.de
    
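  • Compare these addresses with the ones extracted from the sieve file. If an address has no filter rule
    (i.e. it was not found in the list), create a rule for it (or simply add it to the extracted addresses).

  • After all mails have been processed, a new rule set is created for the "Newsletter" folder,
    and the existing dovecot.sieve file is replaced using the file of extracted addresses (the old file is backed up first, just in case).
  • dovecot might also need a restart so that it reads the new rules?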
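
Steps two and three above could be sketched with Python's standard `email` package instead of splitting on angle brackets by hand. This is only an illustration under assumptions: it targets Python 3, the helper names `sender_address` and `new_senders` are made up for this example, and the messages are passed in as strings rather than read from the mail folder. Addresses are compared in lower case here, which is a choice of this sketch.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_address(raw_message):
    """Return the lower-cased address from the From: header, or None."""
    msg = message_from_string(raw_message)
    # parseaddr handles both "Name <addr>" and a bare "addr"
    _, addr = parseaddr(msg.get("From", ""))
    return addr.lower() or None


def new_senders(raw_messages, known_addresses):
    """Return the senders without an existing rule, sorted alphabetically."""
    known = {a.lower() for a in known_addresses}
    found = {sender_address(m) for m in raw_messages}
    found.discard(None)  # messages without a usable From: header
    return sorted(found - known)


if __name__ == "__main__":
    mail = (
        "Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)\n"
        "From: Johannes Ebert - Redaktion c't <infoservice@heise.de>\n"
        "To: docbrown@example.de\n"
        "\n"
        "body\n"
    )
    print(new_senders([mail], ["newsletter@example.com", "news@yahoo.de"]))
```

Working with sets means a sender that appears in many mails is only reported once, and only the message headers are inspected, so a "From: " string inside a mail body is not picked up.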

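Progress so far:

I tried to do this with plain bash commands and tools. That got me close to extracting the email addresses from the dovecot.sieve file, but it was rather complicated for me and took quite some time.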

#!/bin/sh

SIEVE=/home/mailman/.dovecot.sieve
cp "$SIEVE" "/home/mailman/autosieve/dovecot.sieve_$(date +backup_%d%m%Y)"
#echo "" > search.txt

X=$(grep -n "Newsletter Filter" "$SIEVE" | cut -d: -f1) #get rule start line number, cut keeps only the number from the grep output
Y=$(grep -n "INBOX.Newsletter" "$SIEVE" | cut -d: -f1) #get rule end line number
X=$((X + 1))  #increment to go into the next line
Y=$((Y - 1))  #decrement to go into the previous line
sed -n "${X},${Y}p" "$SIEVE" > /home/mailman/search.txt  #copy lines into separate search_file
awk -F '"' '{ if ($2 != "") print $4 }' /home/mailman/search.txt > /home/mailman/adressen.txt # filter addresses and export to separate file

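So I wonder whether this could be done more simply in python. I tried python in another Raspberry Pi project, but I did not have the time to learn it in depth.

So I am hoping for some help, advice or pointers.

So far I have found solutions to similar problems (for the first part, the extraction), but I could not fully adapt them, or I made mistakes because I could not get the script to run.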

#!/usr/bin/python

import pprint

rule = {}
current_rule = None

with open("dovecot.sieve", "r") as sieve_file:
    for line in sieve_file:
        # the fileinto line marks the end of the newsletter rule
        if "INBOX.Newsletter" in line:
            break
        # the comment line marks the start of the newsletter rule
        if "Newsletter Filter" in line:
            current_rule = rule.setdefault('Newsletter', [])
            continue
        # inside the rule a test looks like: address :is "from" "<address>"
        parts = line.split('"')
        if current_rule is not None and len(parts) > 4 and parts[1] == "from":
            current_rule.append(parts[3])

# Now print out all the data
print("whole array")
print("==============================")
pprint.pprint(rule)
print("")
print("addresses found")
print("==========================")
pprint.pprint(rule.get('Newsletter', []))

Can anyone recommend an IDE for python, with debugging and so on? Eclipse comes to mind, or is there something else (maybe less resource-hungry)?


1 Answer


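OK, I had some spare time to solve my own problem. I did some research, read a few code snippets and tested things in PyDev in Eclipse.

The script now runs as a scheduled job every night.

What does the script do?

It collects all the email addresses in the dovecot.sieve file (essentially those in the "Newsletter" rule). It then looks through the INBOX.Newsletter folder and compares the senders against the collected addresses to find the ones that are not registered yet. If it finds new addresses, it first saves a copy of the old sieve file and then rewrites the existing file. The new email addresses are added to the "Newsletter" rule, so that those mails are delivered to the Newsletter folder.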

#!/usr/bin/python2.7

import os, sys
#Get the already configured email senders...
addresses = {}
current_addresses = None

with open("/home/postman/.dovecot.sieve", "r") as sieveconf:
    for line in sieveconf:
        if "INBOX.Newsletter" in line:
            break

        if "Newsletter Filter" in line:
            current_addresses = addresses.setdefault('found', [])
            continue

        if "from" in line and current_addresses is not None:
            line = line.split('"')

            if (len(line) > 4) and (line[1] == "from"):
                current_addresses.append(line[3])

                continue

#save the count for later (the list stays empty if no Newsletter rule was found)
current_addresses = addresses.setdefault('found', [])
addr_num = len(current_addresses)

#iterate all files in all sub-directories of INBOX.Newsletter
for root, _,files in os.walk("/home/postman/Mails/.INBOX.Newsletter"):
    #for each file in the current directory
    for emaildir in files:
        #open the file
        with open(os.path.join(root, emaildir), "r") as mail:
            #scan line by line
            for line in mail:
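                if line.startswith("From: "):
                    #arm boolean value for adding to list
                    found_sw = False
                    #skip senders that are not written in angle brackets
                    if '<' not in line or '>' not in line:
                        break
                    #extract substring from line
                    found = ((line.split('<'))[1].split('>')[0])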
                    #compare found address with already existing addresses in dictionary
                    for m_addr in addresses['found']:
                        if m_addr == found:
                            #remember if the address is already in the dictionary
                            found_sw = True
                            break

                    if not found_sw:
                        #if the address is not included in the dictionary put it there
                        current_addresses.append(found)
                    break


# Now print out all the data
#import pprint
#print "addresses found:"
#print "=========================="
#pprint.pprint(addresses['found'])
#print
#print "orig_nmbr_of_addresses:" , addr_num
#print "found_nmbr_of_addresses:", len(addresses['found'])
#print "not_recorded_addresses:", (len(addresses['found']) - (addr_num))

#Compare if the address count has changed
if addr_num == len(addresses['found']):
    #exit the script since no new addresses have been found
    sys.exit()
else:
    #copy original sieve file for backup
    import datetime
    from shutil import copyfile
    #use the same sieve file that was read at the top of the script
    sievefile = '/home/postman/.dovecot.sieve'
    backupfilename = '.backup_%s.sieve'% datetime.date.today()
    copyfile(sievefile, backupfilename)

    #edit the existing sieve file and add the new entries
    import fileinput
    #open file for in place editing
    for line in fileinput.input(sievefile, inplace=1):
        #if the line before the last entry is reached (this needs at least two existing entries)
        if addr_num >= 2 and ('"%s"' % addresses['found'][(addr_num - 2)]) in line:
            #print the line
            print line,
            #put new rules before the last line (just to avoid extra handling for last line, since the lines before are rather identical)
            for x in range (addr_num, (len(addresses['found']))):
                print '               address :is "from" "%s",'% addresses['found'][x]
        else:
            #print all other lines
            print line,
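
As an alternative to locating the line before the last entry, the whole Newsletter condition could be regenerated from the address list. The following is a sketch under assumptions, not the script above: it targets Python 3, the names `build_newsletter_rule`, `replace_newsletter_rule` and `RULE_RE` are made up for this example, and it expects a sieve file laid out like the one in the question, where the rule starts at the `/* Newsletter Filter */` comment, is an `elsif` branch, and ends at a closing brace on its own line. Addresses are assumed not to contain double quotes or backslashes.

```python
import re


def build_newsletter_rule(addresses):
    """Render the Newsletter rule for the given sender addresses."""
    if not addresses:
        raise ValueError("need at least one address")
    tests = ",\n".join(
        '               address :is "from" "{}"'.format(a) for a in addresses
    )
    lines = [
        "/* Newsletter Filter */",
        "  elsif anyof (",
        tests + ") {",
        '  fileinto "INBOX.Newsletter";',
        "  stop;",
        "}",
    ]
    return "\n".join(lines)


# From the marker comment up to the first closing brace that stands on a
# line of its own after the fileinto.
RULE_RE = re.compile(
    r'/\* Newsletter Filter \*/.*?fileinto "INBOX\.Newsletter";.*?^[ \t]*\}',
    re.DOTALL | re.MULTILINE,
)


def replace_newsletter_rule(sieve_text, addresses):
    """Return sieve_text with the Newsletter rule rebuilt from addresses."""
    new_rule = build_newsletter_rule(addresses)
    # a function as replacement keeps the rule text from being read as a
    # regex replacement template
    result, count = RULE_RE.subn(lambda m: new_rule, sieve_text, count=1)
    if count != 1:
        raise ValueError("Newsletter rule not found")
    return result
```

The caller would read the sieve file into a string, pass the complete address list (old and new entries), and write the result back after making the backup. Because the rule is rebuilt as a whole, it does not matter how many addresses it contained before.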
