Trouble parsing specific data from a log file

Posted 2024-04-27 17:55:18


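I have an assignment whose main task is to report suspicious activity in a log file. There are several other questions I need to solve as well, but the one I most want to focus on is "suspicious activity" (if I can get a handle on that, a light bulb will probably go off and help me with the rest). The way to do this should be to record when users log in between 12:00 and 5:00. Once a user is flagged as suspicious, the user's name, email, and domain should be displayed as output.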

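I have never worked with log files before, and this is my first time processing one in Python 3 (in PyCharm specifically). The assignment has been challenging so far because I don't know where to start. My initial plan was to use regular expressions to match specific text in the log file against keys in a dictionary, but I'm not sure that is the right way to approach it.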

Here is the sample log: Sample Behavior

This is the user log file: userlog.log

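I apologize if my post is a bit confusing; this is my first time using Stack Overflow. My goal is to gather ideas on how to work through this assignment step by step. Thanks for any suggestions. Edit: below is a pasted portion of the user log file.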

2020-05-23 00:44:42 login mailserver.local melaina.gabeline@yahoo.com.mx
2020-05-15 10:54:11 logout mailserver.local sevan.stephco@miho-nakayama.com
2020-05-07 11:25:24 login myworkstation.local breena.benassi@gmx.net
2020-05-14 16:31:34 logout webserver.local arti.karshner@mail2perry.com
2020-05-12 17:02:10 login mailserver.local queen.ham@quiklinks.com
2020-05-30 23:01:30 logout mailserver.local maryelizabeth.stassen@freesurf.fr
2020-05-11 15:04:32 logout myworkstation.local lupe.gave@freesurf.fr
2020-05-26 13:51:35 logout mailserver.local tarrin.evanoff@blacksburg.net
2020-05-15 02:21:39 logout mailserver.local maryelizabeth.stassen@freesurf.fr
2020-05-05 14:16:13 login mailserver.local aprilmarie.ulatowski@freesurf.fr
2020-05-21 03:53:37 login mailserver.local tarrin.naysmith@mail2champaign.com
2020-05-05 06:17:09 login webserver.local melaina.gabeline@yahoo.com.mx
2020-05-24 18:24:49 logout myworkstation.local kira.pay@mail2zambia.com


1 answer
Anonymous user
Reply 1 · posted 2024-04-27 17:55:18

To read the log, use readlines to get all the lines of the file, and for each line use split to break it into columns. Use datetime to compare the times.
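The split-and-compare step described above can be sketched in isolation. This is a minimal sketch, not the answer's exact code: the helper names parse_line and is_suspicious are hypothetical, and the window is assumed to run from midnight (inclusive) to 05:00 (exclusive), which matches the `tm < FiveAM` test used in the answer's code.

```python
from datetime import datetime, time

# Assumption: "suspicious" means a login from 00:00:00 up to, but not including, 05:00:00.
WINDOW_START = time(0, 0)
WINDOW_END = time(5, 0)

def parse_line(line):
    """Split one log line into (timestamp, action, host, email)."""
    date_str, time_str, action, host, email = line.split()
    stamp = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    return stamp, action, host, email

def is_suspicious(stamp, action):
    """Return True for a login whose time of day falls inside the assumed window."""
    return action == "login" and WINDOW_START <= stamp.time() < WINDOW_END

stamp, action, host, email = parse_line(
    "2020-05-21 03:53:37 login mailserver.local tarrin.naysmith@mail2champaign.com"
)
print(is_suspicious(stamp, action))  # True
```

Comparing datetime.time values avoids building a dummy date just to compare the time of day, and it makes both ends of the window explicit.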

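To get the list of users who are still logged in, add a user to the list on login and remove that user on logout. At the end, whatever remains in the list is the set of users still logged in.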
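One caveat: the lines in the pasted sample are not in chronological order, so replaying them in file order may give a different "still logged in" result than replaying them by timestamp. The sketch below sorts the events first. It assumes that a session is identified by the email alone (the host is ignored), and the two log lines for alice@example.com are hypothetical, made up only to show the effect of ordering.

```python
from datetime import datetime

# Hypothetical lines: the logout appears later in the file but is earlier in time.
sample = """\
2020-05-10 09:00:00 login host.local alice@example.com
2020-05-09 18:00:00 logout host.local alice@example.com
"""

def still_logged_in(lines):
    """Replay login/logout events in timestamp order and return the open sessions."""
    events = []
    for line in lines:
        date_str, time_str, action, host, email = line.split()
        stamp = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
        events.append((stamp, action, email))
    events.sort(key=lambda event: event[0])  # oldest event first

    logged_in = set()
    for stamp, action, email in events:
        if action == "login":
            logged_in.add(email)
        else:
            logged_in.discard(email)  # no error if the user was not logged in
    return logged_in

print(still_logged_in(sample.splitlines()))  # {'alice@example.com'}
```

Replayed in file order, the same two lines would leave the set empty, because the logout would be applied after the login.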

Try the following code:

ss = '''
2020-05-23 00:44:42 login mailserver.local melaina.gabeline@yahoo.com.mx
2020-05-15 10:54:11 logout mailserver.local sevan.stephco@miho-nakayama.com
2020-05-07 11:25:24 login myworkstation.local breena.benassi@gmx.net
2020-05-14 16:31:34 logout webserver.local arti.karshner@mail2perry.com
2020-05-12 17:02:10 login mailserver.local queen.ham@quiklinks.com
2020-05-30 23:01:30 logout mailserver.local maryelizabeth.stassen@freesurf.fr
2020-05-11 15:04:32 logout myworkstation.local lupe.gave@freesurf.fr
2020-05-26 13:51:35 logout mailserver.local tarrin.evanoff@blacksburg.net
2020-05-15 02:21:39 logout mailserver.local maryelizabeth.stassen@freesurf.fr
2020-05-05 14:16:13 login mailserver.local aprilmarie.ulatowski@freesurf.fr
2020-05-21 03:53:37 login mailserver.local tarrin.naysmith@mail2champaign.com
2020-05-05 06:17:09 login webserver.local melaina.gabeline@yahoo.com.mx
2020-05-24 18:24:49 logout myworkstation.local kira.pay@mail2zambia.com
'''.strip()

with open('userlog.log','w') as f: f.write(ss)  # write log file

#############################

import datetime

loggedin = set()  # users still logged in
allusers = set()  # all users

FiveAM = datetime.datetime.strptime('05:00:00', '%H:%M:%S')

with open('userlog.log') as f:
    lines = f.readlines()
    for ln in lines:
        cols = ln.split()  # split at whitespace: date is col 0, time is col 1
        dt = datetime.datetime.strptime(' '.join(cols[:2]), '%Y-%m-%d %H:%M:%S')  # full date and time
        tm = datetime.datetime.strptime(cols[1], '%H:%M:%S')  # time of day only
        print(cols)
        allusers.add(cols[4])
        if cols[2] == 'login' and tm < FiveAM:
            print('>>> Suspicious! ' + cols[4] + ' is a spy!')
        if cols[2] == 'login':
            loggedin.add(cols[4])  # email, add to login list
        else:  # logout
            loggedin.discard(cols[4])  # remove from login list if present
          
print('\n  Users logged in   \n', '\n'.join(loggedin), sep='')  # users still logged in

print('\n  Users logged out   \n', '\n'.join(allusers-loggedin), sep='')  # users not still logged in

Output

['2020-05-23', '00:44:42', 'login', 'mailserver.local', 'melaina.gabeline@yahoo.com.mx']
>>> Suspicious! melaina.gabeline@yahoo.com.mx is a spy!
['2020-05-15', '10:54:11', 'logout', 'mailserver.local', 'sevan.stephco@miho-nakayama.com']
['2020-05-07', '11:25:24', 'login', 'myworkstation.local', 'breena.benassi@gmx.net']
['2020-05-14', '16:31:34', 'logout', 'webserver.local', 'arti.karshner@mail2perry.com']
['2020-05-12', '17:02:10', 'login', 'mailserver.local', 'queen.ham@quiklinks.com']
['2020-05-30', '23:01:30', 'logout', 'mailserver.local', 'maryelizabeth.stassen@freesurf.fr']
['2020-05-11', '15:04:32', 'logout', 'myworkstation.local', 'lupe.gave@freesurf.fr']
['2020-05-26', '13:51:35', 'logout', 'mailserver.local', 'tarrin.evanoff@blacksburg.net']
['2020-05-15', '02:21:39', 'logout', 'mailserver.local', 'maryelizabeth.stassen@freesurf.fr']
['2020-05-05', '14:16:13', 'login', 'mailserver.local', 'aprilmarie.ulatowski@freesurf.fr']
['2020-05-21', '03:53:37', 'login', 'mailserver.local', 'tarrin.naysmith@mail2champaign.com']
>>> Suspicious! tarrin.naysmith@mail2champaign.com is a spy!
['2020-05-05', '06:17:09', 'login', 'webserver.local', 'melaina.gabeline@yahoo.com.mx']
['2020-05-24', '18:24:49', 'logout', 'myworkstation.local', 'kira.pay@mail2zambia.com']

  Users logged in  
aprilmarie.ulatowski@freesurf.fr
melaina.gabeline@yahoo.com.mx
queen.ham@quiklinks.com
breena.benassi@gmx.net
tarrin.naysmith@mail2champaign.com

  Users logged out  
maryelizabeth.stassen@freesurf.fr
lupe.gave@freesurf.fr
sevan.stephco@miho-nakayama.com
arti.karshner@mail2perry.com
tarrin.evanoff@blacksburg.net
kira.pay@mail2zambia.com
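The question asks for the user's name, email, and domain, while the code above prints only the email. Since the question also mentions regular expressions, here is one possible sketch that pulls all three out of a line with named groups. It assumes that "name" means the part of the email before the @ and "domain" the part after it; the names LINE_RE and describe are hypothetical.

```python
import re

# Assumed layout: date, time, action, host, email, separated by whitespace.
LINE_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<action>login|logout)\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<name>[^@\s]+)@(?P<domain>\S+)"
)

def describe(line):
    """Return a dict of the fields in one log line, or None if it does not match."""
    match = LINE_RE.fullmatch(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["email"] = f"{fields['name']}@{fields['domain']}"
    return fields

info = describe(
    "2020-05-21 03:53:37 login mailserver.local tarrin.naysmith@mail2champaign.com"
)
print(info["name"], info["email"], info["domain"])
# tarrin.naysmith tarrin.naysmith@mail2champaign.com mail2champaign.com
```

For a fixed, whitespace-separated layout like this one, split is simpler; a regular expression mainly helps when malformed lines need to be detected and skipped.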
