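<p>If you want the date at the start (per the comments, the other two fields are not the most important) and you want to keep the current pattern, you could use an <a href="https://www.regular-expressions.info/alternation.html" rel="nofollow noreferrer">alternation</a>:</p>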
<pre><code>^([a-zA-Z]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})|([^ ]+)=([^ ]+)
</code></pre>
<ul>
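<li><code>^</code> Start of the string</li>
<li><code>([a-zA-Z]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})</code> Capture group 1, matches a "date-like" pattern</li>
<li><code>|</code> Or</li>
<li><code>([^ ]+)=([^ ]+)</code> The initial pattern, capturing the key in group 2 and the value in group 3</li>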
</ul>
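<p>Only one branch of the alternation takes part in each match, so the groups of the other branch come back empty. A minimal sketch of what <code>re.findall</code> returns for this pattern before any filtering; the shortened <code>sample</code> line is made up for illustration:</p>

```python
import re

regex = r"^([a-zA-Z]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})|([^ ]+)=([^ ]+)"
# Hypothetical, shortened log line used only for illustration
sample = "Aug 13 17:16:33 app-srv01 kernel: IN=eth0 OUT= LEN=52"

# Each match is a 3-tuple; groups that did not participate are empty strings
raw = re.findall(regex, sample)
print(raw)
# [('Aug 13 17:16:33', '', ''), ('', 'IN', 'eth0'), ('', 'LEN', '52')]
```

<p>Note that <code>OUT=</code> produces no match, because <code>([^ ]+)</code> requires at least one non-space character after the <code>=</code>.</p>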
<p><a href="https://regex101.com/r/MywxGE/1" rel="nofollow noreferrer">Regex demo</a> | <a href="https://ideone.com/BTxSRr" rel="nofollow noreferrer">Python demo</a></p>
<p>For example</p>
<pre><code>import re
# Group 1: date at the start of the line; groups 2 and 3: key and value of a key=value pair
regex = r"^([a-zA-Z]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})|([^ ]+)=([^ ]+)"
test_str = "Aug 13 17:16:33 app-srv01 kernel: newConnection - IN=eth0 OUT= MAC=56:00:01:a1:5c:b7:fe:00:01:a1:5c:b7:08:00 SRC=91.103.125.80 DST=45.33.223.166 LEN=52 TOS=0x00 PREC=0x00 TTL=113 ID=21200 DF PROTO=TCP SPT=55743 DPT=445 WINDOW=8192 RES=0x00 SYN URGP=0"
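# findall yields one 3-tuple per match; filter(None, ...) drops the empty
# strings left by the groups that did not take part in that match
print(list(map(lambda x: tuple(filter(None, x)), re.findall(regex, test_str))))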
</code></pre>
<p>Result</p>
<blockquote>
<p>[('Aug 13 17:16:33',), ('IN', 'eth0'), ('MAC',
'56:00:01:a1:5c:b7:fe:00:01:a1:5c:b7:08:00'), ('SRC',
'91.103.125.80'), ('DST', '45.33.223.166'), ('LEN', '52'), ('TOS',
'0x00'), ('PREC', '0x00'), ('TTL', '113'), ('ID', '21200'), ('PROTO',
'TCP'), ('SPT', '55743'), ('DPT', '445'), ('WINDOW', '8192'), ('RES',
'0x00'), ('URGP', '0')]</p>
</blockquote>
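<p>If one record per log line is more convenient than a list of tuples, the same pattern could be used with <code>re.finditer</code>. This is only a sketch: the helper name <code>parse_line</code>, the <code>"date"</code> key, and the shortened sample line are assumptions, not part of the original code:</p>

```python
import re

pattern = re.compile(
    r"^([a-zA-Z]+ \d{1,2} \d{1,2}:\d{1,2}:\d{1,2})|([^ ]+)=([^ ]+)"
)

def parse_line(line):
    """Collect the leading date and all key=value pairs into one dict."""
    record = {}
    for m in pattern.finditer(line):
        if m.group(1) is not None:
            # First branch matched: the date at the start of the line
            record["date"] = m.group(1)
        else:
            # Second branch matched: group 2 is the key, group 3 the value
            record[m.group(2)] = m.group(3)
    return record

# Hypothetical, shortened log line used only for illustration
line = "Aug 13 17:16:33 app-srv01 kernel: newConnection - IN=eth0 OUT= SRC=91.103.125.80 DPT=445"
print(parse_line(line))
# {'date': 'Aug 13 17:16:33', 'IN': 'eth0', 'SRC': '91.103.125.80', 'DPT': '445'}
```

<p>Checking <code>m.group(1) is not None</code> says explicitly which branch matched, instead of relying on <code>filter(None, ...)</code> to discard empty strings.</p>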