How can I use TextBlob machine learning to detect valid email addresses?

Posted 2024-05-09 17:33:52

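I want to detect valid email addresses, but my code just returns the label that has the greater probability in the dataset. Clearly it is not working as I expected.

Here is the code: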

from textblob.classifiers import NaiveBayesClassifier

# Each entry pairs a training file with the label applied to every line in it.
files = [
    ("data_train/email_positive.txt", "yes"),
    ("data_train/email_negative.txt", "no"),
]

train = []
for path, label in files:
    with open(path) as f:
        for line in f:
            # Drop the trailing newline so it is not part of the sample.
            train.append((line.rstrip("\n"), label))

cl = NaiveBayesClassifier(train)
print(cl.classify("wrong_email@2x.png"))
# Actual output: yes
# Expected output: no

Some samples from the valid email dataset:

hello@3commerceinc.com
sales@ablefreight.com
dispatchwaycross@absolutewl.com
ops@absolutewl.com
tol@absolutewl.com
email@gmail.com
email@hotmail.com
. . . 

Some samples from the invalid email dataset:

pause@2x.png
video@2x.png
right@2x.png
play@2x.png
circle-hover@2x.png
preloader@2x.gif
left@2x.png
circle@2x.png
. . . 
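
In the samples above, the visible difference between the two classes is the text after the last dot: "com" for the valid addresses, "png" or "gif" for the invalid ones. One possible direction, offered only as a sketch, is a custom feature extractor that exposes that suffix to the classifier. The function name suffix_features and the choice of feature are assumptions made for illustration; they are not part of the original code or of TextBlob.

```python
def suffix_features(address):
    """Hypothetical feature extractor: expose the text after the last dot."""
    # Keep only the part after the last '@'; if there is no '@',
    # rpartition leaves the whole string in the last slot.
    domain = address.strip().lower().rpartition("@")[2]
    # Text after the last dot, e.g. 'com' or 'png'; empty when there is no dot.
    suffix = domain.rpartition(".")[2] if "." in domain else ""
    return {"suffix": suffix}


for sample in ("email@gmail.com", "circle-hover@2x.png"):
    print(sample, suffix_features(sample))
```

NaiveBayesClassifier accepts a feature_extractor argument, so the function could be passed as NaiveBayesClassifier(train, feature_extractor=suffix_features). Whether that changes the result for "wrong_email@2x.png" depends on the full training files, which are not shown here.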
