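Script to automatically create dovecot.sieve rules for mail in an inbox subfolder
After browsing and using solutions from this great site for a while, I have finally decided to take part.
I have a fairly clear idea of what I want, but I am looking for the best way to implement it.
What do I want?
I have set up a mail server on a Raspberry Pi, and it works well. It consists of a Dovecot server and some sieve filters that sort the mail for my many email addresses into different inbox subfolders. There is also a spam filter that a nightly script trains to tell ham from spam. (In short, it is taught that the Junk folder contains spam and that all other folders contain ham.)
I want to replicate this behaviour for a dedicated "Newsletter" folder. This folder holds no urgent messages that need to be read or reported immediately.
My plan is to move mail into the "Newsletter" folder by hand and have a script scan that folder once a day. If it finds mail from an address that has no sieve rule yet, it should create a rule that automatically files mail from that address into the "Newsletter" folder.
Implementation steps?
First, the script needs to scan the existing .dovecot.sieve file and extract the addresses from the "Newsletter folder" rule into a separate file or object for comparison.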
/* Example of a sieve filter: */
require "fileinto";
/* Global Spam Filter */
if anyof (header :contains "subject" "*SPAM*",
          header :contains "X-Spam-Flag" "YES" )
{
  fileinto "Junk";
  stop;
}
/* LAN Emails Filter */
elsif address :is "to" "lan@docbrown.pi"
{
  fileinto "INBOX.Lokal";
  stop;
}
/* Newsletter Filter */
elsif anyof (address :is "from" "newsletter@example.com",
             address :is "from" "news@yahoo.de",
             address :is "from" "info@mailbox.de",
             address :is "from" "something@somewhere.de")
{
  fileinto "INBOX.Newsletter";
  stop;
}
/* gmail Account Filter */
elsif address :is "to" "docbrown@gmail.com"
{
  fileinto "INBOX.gmail";
  stop;
}
/* Yahoo Account Filter */
elsif address :is "to" "docbrown@yahoo.de"
{
  fileinto "INBOX.yahoo";
  stop;
}
else
{
  # The rest goes into INBOX
  # default is "implicit keep", we do it explicitly here
  keep;
}
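The extraction step could be sketched in Python roughly as follows. This is only a minimal sketch written for Python 3: the function name `extract_newsletter_addresses` and the two marker constants are made up for illustration, and it assumes the rule always starts at the "Newsletter Filter" comment and ends at the first "INBOX.Newsletter" after it, as in the example filter.

```python
import re

# Markers taken from the example filter; both are assumptions about the layout.
RULE_START = "Newsletter Filter"
RULE_END = "INBOX.Newsletter"

# Matches: address :is "from" "someone@example.com"
FROM_TEST = re.compile(r'address\s+:is\s+"from"\s+"([^"]+)"')


def extract_newsletter_addresses(sieve_text):
    """Return the "from" addresses listed in the Newsletter rule, in file order."""
    start = sieve_text.find(RULE_START)
    if start == -1:
        return []  # no Newsletter rule in this file
    end = sieve_text.find(RULE_END, start)
    if end == -1:
        end = len(sieve_text)
    # Only search between the two markers so "to" rules elsewhere are ignored.
    return FROM_TEST.findall(sieve_text[start:end])
```

Calling it with the contents of the sieve file, for example `extract_newsletter_addresses(open(path).read())`, returns a plain list that can be compared against the senders found in the folder.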
Then it needs to go through all mails in the "Newsletter" folder and look for the "From:" field and the email address in angle brackets.
Date: Mon, 4 Nov 2013 16:38:30 +0100 (CET)
From: Johannes Ebert - Redaktion c't <infoservice@heise.de>
To: docbrown@example.de
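Reading the sender from headers like these does not have to be done by hand; Python's standard `email` package can parse them. The sketch below is written for Python 3, and the helper name `sender_address` is hypothetical. It takes the raw text of one message and returns the bare address from the From header, whether or not it is wrapped in angle brackets.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_address(raw_message):
    """Return the address from the From header, or None if there is none."""
    message = message_from_string(raw_message)
    # parseaddr splits 'Display Name <user@host>' into (name, address).
    _display_name, address = parseaddr(message.get("From", ""))
    return address or None
```

Because the message is parsed as a whole, a line in the body that happens to start with "From: " is not mistaken for the header.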
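Compare these addresses with the ones extracted from the sieve file. If an address has no filter rule yet (i.e. it is not found in the list), create a rule for it (or simply add it to the extracted addresses).
- After all mails have been processed, a new rule set is created for the "Newsletter" folder.
- The existing dovecot.sieve file is replaced by one built from the extracted email addresses (the old file is backed up first, just in case).
- Dovecot may also need to be restarted so that it picks up the new rules?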
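The compare-and-rebuild steps above might look like the following in Python 3. Both function names are invented for this sketch, and the generated layout simply copies the "Newsletter Filter" block from the example filter. Comparison is done case-insensitively, which is an assumption that matches sieve's default comparator.

```python
def merge_addresses(existing, found):
    """Append addresses from `found` that are not yet in `existing`."""
    merged = list(existing)
    seen = {address.lower() for address in existing}
    for address in found:
        key = address.lower()
        if key not in seen:
            seen.add(key)
            merged.append(address)
    return merged


def render_newsletter_rule(addresses, folder="INBOX.Newsletter"):
    """Build the text of the Newsletter rule for the given addresses."""
    if not addresses:
        raise ValueError("a sieve anyof test needs at least one address")
    for address in addresses:
        # Quotes or backslashes would break out of the sieve string literal.
        if '"' in address or "\\" in address:
            raise ValueError("unsafe character in address: %r" % address)
    tests = ",\n             ".join(
        'address :is "from" "%s"' % address for address in addresses
    )
    return (
        "/* Newsletter Filter */\n"
        "elsif anyof (%s)\n"
        "{\n"
        '  fileinto "%s";\n'
        "  stop;\n"
        "}\n"
    ) % (tests, folder)
```

Regenerating the whole block, rather than inserting lines into it, avoids special handling for the last entry with its closing parenthesis.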
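Progress so far:
I tried to get there using plain bash commands and tools. That got me close to extracting the email addresses from the dovecot.sieve file, but it is somewhat complicated for me and took quite some time.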
#!/bin/sh
# back up the current sieve file
cp /home/mailman/.dovecot.sieve "/home/mailman/autosieve/dovecot.sieve_$(date +backup_%d%m%Y)"
#echo "" > search.txt
# get rule start line number (grep -n prints "N:text", cut keeps only N)
X=$(grep -n "Newsletter Filter" /home/mailman/.dovecot.sieve | head -n 1 | cut -d: -f1)
# get rule end line number
Y=$(grep -n "INBOX.Newsletter" /home/mailman/.dovecot.sieve | head -n 1 | cut -d: -f1)
X=$((X + 1)) # increment to go into the next line
Y=$((Y - 1)) # decrement to go into the previous line
# copy lines into separate search_file
sed -n "${X},${Y}p" /home/mailman/.dovecot.sieve > /home/mailman/search.txt
# filter addresses and export to separate file
awk -F '"' '{ if ($2 != "") print $4 }' /home/mailman/search.txt > /home/mailman/adressen.txt
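So I was wondering whether this could be done more easily in Python. I tried Python in another Raspberry Pi project, but never had the time to learn it in depth.
So I am hoping for some help, advice, or pointers.
So far I have found some solutions to similar problems (for the first part, the extraction), but I was not able to fully adapt them, or I made some mistakes along the way, as I could not run the script.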
#!/usr/bin/python
import pprint

rule = {}
current_rule = None
with open("dovecot.sieve", "r") as sieve_file:
    for line in sieve_file:
        # the fileinto line marks the end of the newsletter rule
        if "INBOX.Newsletter" in line:
            break
        # the comment line marks the start of the newsletter rule
        if "Newsletter Filter" in line:
            current_rule = rule.setdefault('Newsletter', [])
            continue
        # ignore everything before the newsletter rule
        if current_rule is None:
            continue
        # splitting on '"' gives [..., 'from', ' ', '<address>', ...]
        parts = line.split('"')
        if len(parts) > 3 and parts[1] == "from":
            current_rule.append(parts[3])

# Now print out all the data
print("whole array")
print("==============================")
pprint.pprint(rule)
print("")
print("addresses found")
print("==========================")
pprint.pprint(rule.get('Newsletter', []))
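Can anyone recommend a good IDE for Python, with debugging support and so on? Eclipse comes to mind, or is there something else (perhaps less resource-hungry)?
1 Answer
OK, I had some spare time to solve my own problem. I did some research, read a few code snippets, and tested things in PyDev for Eclipse.
I now have this script scheduled to run every night.
What does the script do?
It collects all email addresses from the dovecot.sieve file (namely those in the "Newsletter" rule). It then looks through the INBOX.Newsletter folder and compares the senders against the collected addresses to find any that are not registered yet. If it finds new addresses, it first saves a copy of the old sieve file and then rewrites the existing file. The new email addresses are added to the "Newsletter" rule, so that those mails are filed into the designated Newsletter folder.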
#!/usr/bin/python2.7
import datetime
import fileinput
import os
import sys
from shutil import copyfile

SIEVE_FILE = "/home/postman/.dovecot.sieve"
NEWSLETTER_DIR = "/home/postman/Mails/.INBOX.Newsletter"

#Get the already configured email senders...
addresses = {}
current_addresses = None
with open(SIEVE_FILE, "r") as sieveconf:
    for line in sieveconf:
        #the fileinto line marks the end of the Newsletter rule
        if "INBOX.Newsletter" in line:
            break
        #the comment line marks the start of the Newsletter rule
        if "Newsletter Filter" in line:
            current_addresses = addresses.setdefault('found', [])
            continue
        if "from" in line and current_addresses is not None:
            #splitting on '"' gives [..., 'from', ' ', '<address>', ...]
            line = line.split('"')
            if (len(line) > 4) and (line[1] == "from"):
                current_addresses.append(line[3])
            continue

#the in-place edit below inserts after the second-to-last entry,
#so the rule must exist and already list at least two addresses
if current_addresses is None or len(current_addresses) < 2:
    sys.exit("Newsletter rule with at least two addresses not found")

#save the count for later
addr_num = len(addresses['found'])

#iterate all files in all sub-directories of INBOX.Newsletter
for root, _, files in os.walk(NEWSLETTER_DIR):
    #for each file in the current directory
    for mailfile in files:
        #open the file
        with open(os.path.join(root, mailfile), "r") as mail:
            #scan line by line
            for line in mail:
                #an empty line ends the headers, do not search the body
                if not line.strip():
                    break
                if line.startswith("From: "):
                    if '<' in line and '>' in line:
                        #extract the address between the angle brackets
                        found = line.split('<')[1].split('>')[0]
                    else:
                        #no angle brackets, the header holds the bare address
                        found = line[len("From: "):].strip()
                    #if the address is not in the list yet, put it there
                    if found and found not in addresses['found']:
                        current_addresses.append(found)
                    break

# Now print out all the data
#import pprint
#print "addresses found:"
#print "=========================="
#pprint.pprint(addresses['found'])
#print
#print "orig_nmbr_of_addresses:" , addr_num
#print "found_nmbr_of_addresses:", len(addresses['found'])
#print "not_recorded_addresses:", (len(addresses['found']) - (addr_num))

#Compare if the address count has changed
if addr_num == len(addresses['found']):
    #exit the script since no new addresses have been found
    sys.exit(0)

#copy original sieve file for backup
backupfilename = '.backup_%s.sieve' % datetime.date.today()
copyfile(SIEVE_FILE, backupfilename)

#edit the existing sieve file and add the new entries
anchor = '"%s"' % addresses['found'][addr_num - 2]
inserted = False
#open file for in place editing (stdout is redirected into the file)
for line in fileinput.input(SIEVE_FILE, inplace=1):
    #write every existing line back unchanged
    sys.stdout.write(line)
    #if the line before the last entry is reached
    if not inserted and anchor in line:
        #put new rules before the last line (just to avoid extra handling for last line, since the lines before are rather identical)
        for x in range(addr_num, len(addresses['found'])):
            sys.stdout.write(' address :is "from" "%s",\n' % addresses['found'][x])
        inserted = True
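
As an alternative to walking the folder with os.walk and reading every file line by line, the standard library's `mailbox` module can open the folder as a Maildir and parse the headers itself. The following is only a sketch, written for Python 3 rather than the 2.7 used above; the function name `collect_senders` is made up, and it assumes the folder is a regular Maildir with `cur`, `new` and `tmp` subdirectories.

```python
import mailbox
from email.utils import parseaddr


def collect_senders(maildir_path):
    """Return the set of lower-cased From addresses found in a Maildir."""
    senders = set()
    # create=False: fail instead of silently creating an empty mailbox.
    box = mailbox.Maildir(maildir_path, factory=None, create=False)
    try:
        for message in box:
            # str() also covers headers that come back as Header objects.
            _name, address = parseaddr(str(message.get("From", "")))
            if address:
                senders.add(address.lower())
    finally:
        box.close()
    return senders
```

With the path from the script this would be called as `collect_senders("/home/postman/Mails/.INBOX.Newsletter")`. Index and control files that Dovecot keeps next to the mails are not read this way, since only the messages in `cur` and `new` are visited.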