Parsing Microsoft DNS debug logs

Asked 2025-04-18 03:36

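I would like to parse the responses in a Microsoft DNS debug log. The idea is to extract the domain names and print how many times each one appears in the log. I usually start by redirecting all the responses into a file with something like grep -v " R " log > tmp, and then search for a domain by hand with grep domain tmp. There has to be a better way.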

20140416 01:38:52 588 PACKET  02030850 UDP Rcv 192.168.0.10 2659 R Q [8281   DR SERVFAIL] A     (11)quad(3)sub(7)domain(3)com(0)
20140416 01:38:52 588 PACKET  02396370 UDP Rcv 192.168.0.5 b297 R Q [8281   DR SERVFAIL] A     (3)pk(3)sub(7)domain(3)com(0)
20140415 19:46:24 544 PACKET  0261F580 UDP Snd 192.168.0.2  795a   Q [0000       NOERROR] A     (11)tertiary(7)domain(3)com(0)
20140415 19:46:24 544 PACKET  01A47E60 UDP Snd 192.168.0.1 f4e2   Q [0001   D   NOERROR] A     (11)quad(3)sub(7)domain(3)net(0)

For the data above, output like the following would be ideal:

domain.com 3
domain.net 1

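This would show that the script or command found three queries for domain.com and one for domain.net. I don't care whether the host and subdomain labels in front of the domain are accounted for. Either a shell command or Python is fine. Here is some rough pseudocode that will hopefully make the question clearer.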

# Rough sketch of the idea: collect the query lines, then summarize them.
with open('log', 'r') as the_file:
    lines = the_file.readlines()

print_list = []
# search for unique queries and count them
for line in lines:
    if ' Q ' in line:    # the " Q " field marks a standard query
        # store until the count for this unique value is complete
        print_list.append(line)

for item in print_list:
    print(item, end='')    # this should become the per-domain summary

3 Answers

This may not produce exactly the output you asked for, but would something like this do the job?

# Take the last whitespace-separated field (the encoded name) of every PACKET line.
with open(r"path\to\file") as f:
    dns = [line.split()[-1] for line in f if "PACKET" in line]

# Count how often each encoded name occurs.
domains = {}
for d in dns:
    domains[d] = domains.get(d, 0) + 1

for k, v in domains.items():
    print("%s %d" % (k, v))
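
The keys counted here are still in the log's length-prefixed form, for example (3)pk(3)sub(7)domain(3)com(0), and every distinct host name gets its own count. A minimal sketch of one way to collapse them to the domain.com form from the question follows. It assumes the last two labels are the domain of interest, which would be wrong for suffixes such as co.uk, and the names registered_domain and count_domains are hypothetical helpers, not part of the answer above.

```python
import re
from collections import Counter

# One label looks like "(7)domain": a length in parentheses, then the label text.
LABEL = re.compile(r'\(\d+\)([^()]+)')

def registered_domain(encoded):
    """Turn '(3)pk(3)sub(7)domain(3)com(0)' into 'domain.com'.

    Assumption: the last two labels form the domain we want to count.
    """
    labels = LABEL.findall(encoded)
    return '.'.join(labels[-2:]).lower()

def count_domains(lines):
    """Count PACKET lines per domain, using the last field as the encoded name."""
    counts = Counter()
    for line in lines:
        if "PACKET" in line:
            counts[registered_domain(line.split()[-1])] += 1
    return counts
```

Calling count_domains(open('log')) and looping over the result's most_common() would then print one line per domain, most frequent first.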

Answered 2025-04-18 by Python大师

How about this? It is a bit of a brute-force approach:

>>> import re
>>> from collections import Counter
>>> with open('t.txt') as f:
...     c = Counter('.'.join(re.findall(r'\(\d+\)([^()]+)', line.split()[-1])[-2:]) for line in f if line.strip())
...
>>> for domain, count in c.most_common():
...     print(domain, count)
...
domain.com 3
domain.net 1

Answered 2025-04-18 by Python大师

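Maybe something like this? I am not great with regular expressions, but based on my understanding of the format you are parsing, this should get the job done.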

#!/usr/bin/env python3

import re

ret = {}

with open('log', 'r') as theFile:
    for line in theFile:
        # Capture the last two labels of the name, e.g. "domain" and "com".
        match = re.search(r'Q \[.+\].+\(\d+\)([^\(]+)\(\d+\)([^\(]+)', line.strip())
        if match is not None:
            key = '.'.join(match.groups())
            ret[key] = ret.get(key, 0) + 1

for k in ret:
    print('%s %d' % (k, ret[k]))
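
The pattern here matches queries and responses alike. If only responses should be counted, one option is to require the R flag in front of the Q opcode. The sketch below assumes that a response line always contains " R Q [" as in the sample lines from the question, and count_responses is a hypothetical name chosen for illustration.

```python
import re
from collections import Counter

# Assumption: a response has a standalone "R" right before the "Q" opcode,
# followed by the bracketed flags, the record type, and the encoded name.
RESPONSE = re.compile(r'\sR\s+Q\s+\[[^\]]*\]\s+\S+\s+(\S+)\s*$')
# One label looks like "(7)domain": a length in parentheses, then the label text.
LABEL = re.compile(r'\(\d+\)([^()]+)')

def count_responses(lines):
    """Count response lines per domain, keeping only the last two labels."""
    counts = Counter()
    for line in lines:
        m = RESPONSE.search(line)
        if m:
            labels = LABEL.findall(m.group(1))
            counts['.'.join(labels[-2:])] += 1
    return counts
```

Lines without the R flag, such as the two Snd lines in the sample, are skipped rather than counted.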

Answered 2025-04-18 by Python大师
