How to extract the IP and user ID from an unformatted string

Posted 2024-06-16 13:08:13


I have a string:

 Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - [84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy 'Machine center' passed on host 84.11.11.11  for user 'hacker'

Some strings look like this:

 Jun 11 00:13:26 webwork-tlv tcp: 2013-06-11 00:13:25 - ive - [10.11.12.19] hacker(Secure ID)[Manage System] - Sensor tlv-entid-001 - timestamp=[Tue Jun 11 02:23:42 2013 ] severity=[4] policyStr=[IDP 20110132] category=[attack] protocol=[tcp] attackStr=[HTTP:XSS:HTML-SCRIPT-IN-URL-VA] rulebaseStr=[IDS] rulebaseType=[Main Rule Base] srcAddr=[10.11.12.19] srcPort=[3333] dstAddr=[66.11.12.13] dstPort=[80] action=[drop] policyVersion=[41] ruleNumber=[3]

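I want to extract the date at the start of the line, the IP between the [] (unless it is an internal IP, i.e. one starting with 10 or 192, in which case nothing should be extracted), and the id that appears before (Secure ID), here hacker.

So the result should be ip:84.11.11.11, id:hacker

Thanks in advance.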


2 Answers

A bit tedious, but:

s = "Jun 11 02:47:04 webwork-tlv tcp: 2013-06-11 02:47:04 - ive - [84.11.11.11] hacker(Secure ID)[Manage System] - Host Checker policy 'Machine center' passed on host 84.11.11.11  for user 'hacker'."
# Take the text after the first '[' and split it at ']'
parts = s.split('[')[1].split(']')
# parts[0] is the IP; the id precedes '(Secure ID)', so strip its leading space
{'ip': parts[0], 'id': parts[1].split('(Secure ID)')[0].strip()}
>>> import re
>>> # s is the log line from the question
>>> regex = re.compile(r"(\[\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\]) ([a-zA-Z0-9]+)")
>>> r = regex.search(s)

# List the groups found
>>> r.groups()
('[84.11.11.11]', 'hacker')
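
Neither snippet above extracts the date or skips internal addresses, so here is a rough sketch that tries to cover all three parts of the question. The names `LINE_RE` and `parse_line` are made up for illustration, and the pattern assumes every line starts with a syslog-style timestamp like `Jun 11 02:47:04`. It also assumes that "internal" can be approximated with the standard library's `ipaddress` check `is_private`, which covers 10.0.0.0/8 and 192.168.0.0/16 but not every address that starts with 192; adjust that test if you really mean any 192.x.x.x address.

```python
import ipaddress
import re

# Hypothetical pattern: leading syslog timestamp, then the first
# "[a.b.c.d] user(Secure ID)" group found on the line.
LINE_RE = re.compile(
    r"^\s*(?P<date>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})"
    r".*?\[(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\]\s*"
    r"(?P<id>[^\s(]+)\(Secure ID\)"
)


def parse_line(line):
    """Return a dict with date, ip and id, or None if the line is skipped."""
    m = LINE_RE.search(line)
    if m is None:
        return None  # line does not have the expected shape
    try:
        addr = ipaddress.ip_address(m.group("ip"))
    except ValueError:
        return None  # looked like an IP but is not a valid one
    if addr.is_private:
        return None  # assumption: private ranges count as "internal"
    return {"date": m.group("date"), "ip": str(addr), "id": m.group("id")}
```

Calling `parse_line` on the first log line from the question should give a dict with the date `Jun 11 02:47:04`, the IP `84.11.11.11` and the id `hacker`, while the second line (source `10.11.12.19`) should come back as `None`.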
