How do I calculate the total time span covered by a log file in Python 2.7?

Posted 2024-06-11 04:58:44

I have several log files structured like this:

Sep  9 12:42:15 apollo sshd[25203]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=189.26.255.11 

Sep  9 12:42:15 apollo sshd[25203]: pam_succeed_if(sshd:auth): error retrieving information about user ftpuser

Sep  9 12:42:17 apollo sshd[25203]: Failed password for invalid user ftpuser from 189.26.255.11 port 44061 ssh2

Sep  9 12:42:17 apollo sshd[25204]: Received disconnect from 189.26.255.11: 11: Bye Bye

Sep  9 19:12:46 apollo sshd[30349]: Did not receive identification string from 199.19.112.130

Sep 10 03:29:48 apollo unix_chkpwd[4549]: password check failed for user (root)

Sep 10 03:29:48 apollo sshd[4546]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=221.12.29.170  user=root

Sep 10 03:29:51 apollo sshd[4546]: Failed password for root from 221.12.29.170 port 56907 ssh2

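There are more dates and times, but this is a sample. I would like to know how to calculate the total time the file covers. I have tried several approaches over about five hours without success.

This was my first attempt. It comes close, but it does not do what I want: it keeps repeating the dates: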

with open(filename, 'r') as file1:
        lines = file1.readlines()
        for line in lines:
            linelist = line.split()
            date2 = int(linelist[1])
            time2 = linelist[2]
            print linelist[0], linelist[1], linelist[2]
            if date1 == 0:
                date1 = date2
                dates.append(linelist[0] + ' ' + str(linelist[1]))
            if date1 < date2:
                date1 = date2
                ttimes.append(datetime.strptime(str(ltime1), FMT) - datetime.strptime(str(time1), FMT))
                time1 = '23:59:59'
                ltime1 = '00:00:00'
                dates.append(linelist[0] + ' ' + str(linelist[1]))
            if time2 < time1:
                time1 = time2
            if time2 > ltime1:
                ltime1 = time2

2 answers

If the entries are in chronological order, you can just look at the first and the last entry:

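# `lines` is assumed to hold the whole file as a single string;
# strip() avoids an empty last entry when the file ends with a newline
entries = lines.strip().split("\n")

first_date = entries[0].split("apollo")[0]
last_date = entries[-1].split("apollo")[0]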
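
The two strings extracted this way still have to be parsed into datetime objects before they can be subtracted. Here is a minimal sketch of that step. It assumes the entries are in chronological order, every non-empty line starts with a `Mon DD HH:MM:SS` timestamp, the whole file falls within one year, and month abbreviations are English (`%b` is locale dependent). The names `parse_stamp`, `log_span` and the `year` parameter are hypothetical and not part of the answer.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog-style line."""
    month, day, clock = line.split()[:3]
    # The log carries no year, so the caller has to supply one (assumption).
    text = "{} {} {} {}".format(year, month, day, clock)
    return datetime.strptime(text, "%Y %b %d %H:%M:%S")

def log_span(lines, year):
    """Return last timestamp minus first timestamp as a timedelta."""
    entries = [l for l in lines if l.strip()]  # skip blank lines
    return parse_stamp(entries[-1], year) - parse_stamp(entries[0], year)

sample = [
    "Sep  9 12:42:15 apollo sshd[25203]: Failed password for invalid user ftpuser",
    "",
    "Sep 10 03:29:51 apollo sshd[4546]: Failed password for root from 221.12.29.170 port 56907 ssh2",
]
print(log_span(sample, 2016))  # 14:47:36
```

The code runs unchanged on Python 2.7 and Python 3. If the log is not guaranteed to be ordered, every line has to be parsed and sorted instead.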

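The log has no year, so I used the current one. Read all the lines, convert each three-letter month name to its month index, and parse each date.

Then sort the list (so it works even if the log entries are mixed up) and take the first & last items. Subtract. Enjoy.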

from datetime import datetime

months = ["","Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]
current_year = datetime.now().year

dates = list()
with open(filename, 'r') as file1:
    for line in file1:
        linelist = line.split()
        if linelist:  # filter out possible empty lines
            linelist[0] = str(months.index(linelist[0]))  # convert 3-letter months to index
            # compose & parse the date; the year is assumed to be the current one
            z = datetime.strptime(" ".join(linelist[0:3]) + " " + str(current_year), "%m %d %H:%M:%S %Y")
            dates.append(z)  # store in list

dates.sort()  # sort the list
first_date = dates[0]
last_date = dates[-1]

# print report & compute time span
print("start {}, end {}, time span {}".format(first_date,last_date,last_date-first_date))

Result:

start 2016-09-09 12:42:15, end 2016-09-10 03:29:51, time span 14:47:36

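Note that because the year is missing, this does not work correctly across December 31 and January 1. I suppose that if we find both January & December in the log, we could assume that January belongs to the following year. This is not supported yet.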
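
One possible way to handle the year boundary is sketched below. It is not part of the answer above. It assumes the entries are already in chronological order (so sorting is not needed) and treats any drop in the month number from one entry to the next as a new year. The function name `parse_with_rollover`, the `start_year` parameter, and the two sample lines are made up for illustration.

```python
from datetime import datetime

def parse_with_rollover(lines, start_year):
    """Parse 'Mon DD HH:MM:SS' prefixes, bumping the year when the month goes backwards."""
    year = start_year
    prev_month = None
    stamps = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:  # skip empty or truncated lines
            continue
        month = datetime.strptime(parts[0], "%b").month
        # Assumption: a smaller month than the previous entry means a new year began.
        if prev_month is not None and month < prev_month:
            year += 1
        prev_month = month
        text = "{} {}".format(year, " ".join(parts[:3]))
        stamps.append(datetime.strptime(text, "%Y %b %d %H:%M:%S"))
    return stamps

# Made-up lines for illustration only
sample = [
    "Dec 31 23:59:00 apollo sshd[1]: example entry",
    "Jan  1 00:01:00 apollo sshd[2]: example entry",
]
stamps = parse_with_rollover(sample, 2016)
print(stamps[-1] - stamps[0])  # 0:02:00
```

This heuristic fails for logs that are out of order or that span more than a year without touching every boundary in sequence; in those cases the year has to come from another source, such as the file's modification time.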
