How to split a log file using awk or sed. Replacing a Python script

Posted 2024-04-27 00:57:39

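Suppose you have one instrument log file per day. During the day, several restarts may occur. For some reason, you want one file per restart.

In the end I used Python to do this, but I would like to do the same thing with awk or sed. Please let me know what you think.

Python script split_instrument_log.py: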

def split_instrument_log(filename):
    """Write one file per restart: filename.0, filename.1, ..."""
    first_line = '--- ServiceHost Start ---'
    with open(filename, 'r') as handle:
        text = handle.read()
    # Splitting on newline + marker removes the marker from every chunk
    # except the first one.
    for count, chunk in enumerate(text.split('\n' + first_line)):
        split_file_name = filename + "." + str(count)
        with open(split_file_name, 'w') as split_handle:
            if count > 0:
                # Put back the marker that split() removed.
                split_handle.write(first_line)
            split_handle.write(chunk)


filename = "instrument.log"
split_instrument_log(filename)

Example instrument.log:

--- ServiceHost Start ---
11:43:54.745 00000001 HOST I  Creating System 2/19/2018 11:43:54 AM
...
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I  Creating System 2/19/2018 2:47:37 PM
...
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I  Creating System 2/19/2018 6:27:57 PM
...

Resulting instrument.log.0:

--- ServiceHost Start ---
11:43:54.745 00000001 HOST       I  Creating System 2/19/2018 11:43:54 AM
...

I have another log that starts with a timestamp and an address:

[05/02/2018 13:32:30.160 UTC] Main Thread (0xb4692000)/ 0 INF socMainExecutable

How can the awk script be updated for this log, given that the timestamp and the address are not constant?


1 answer
User
#1 · Posted 2024-04-27 00:57:39

With awk this is quite straightforward:

Input:

$ more instrument.log
--- ServiceHost Start ---
11:43:54.745 00000001 HOST I  Creating System 2/19/2018 11:43:54 AM
blabla1
blabla2
blabla3
...
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I  Creating System 2/19/2018 2:47:37 PM
...
blabla4
blabla5
blabla6
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I  Creating System 2/19/2018 6:27:57 PM
...
blabla7
blabla8
blabla9

awk script:

awk -v i=-1 '/--- ServiceHost Start ---/{i++; print $0 > ("instrument.log." i); next}{print $0 >> ("instrument.log." i)}' instrument.log

Output:

$ more instrument.log.?
::::::::::::::
instrument.log.0
::::::::::::::
--- ServiceHost Start ---
11:43:54.745 00000001 HOST I  Creating System 2/19/2018 11:43:54 AM
blabla1
blabla2
blabla3
...
::::::::::::::
instrument.log.1
::::::::::::::
--- ServiceHost Start ---
14:47:37.071 00000001 HOST I  Creating System 2/19/2018 2:47:37 PM
...
blabla4
blabla5
blabla6
::::::::::::::
instrument.log.2
::::::::::::::
--- ServiceHost Start ---
18:27:57.463 00000001 HOST I  Creating System 2/19/2018 6:27:57 PM
...
blabla7
blabla8
blabla9

Explanation:

  • -v i=-1 passes a variable i to awk with an initial value of -1. You could also define it in a BEGIN clause like this: BEGIN{i=-1}.
  • /--- ServiceHost Start ---/{i++; print $0 > ("instrument.log." i); next} Whenever awk finds a line containing --- ServiceHost Start ---, it increments i and prints the line to the file "instrument.log." i before moving on to the next line. (If the file already exists, it is overwritten.) The parentheses around the file name keep the redirection target unambiguous across awk implementations.
  • {print $0 >> ("instrument.log." i)} Every other line is simply appended to the file "instrument.log." i.
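
The awk command matches a fixed marker, so it does not cover the second log from the question, where the timestamp and the address change on every restart. As a rough sketch only, here is the same line-by-line logic in Python (the language of the question's script), with the marker expressed as a regular expression. The names split_log, SERVICE_HOST and SOC_MAIN are hypothetical, and the sketch assumes that a restart in the second log is the line that logs socMainExecutable, which the question does not state explicitly. Unlike the awk command, it drops any lines that appear before the first marker.

```python
import re

# Marker of the first log in the question.
SERVICE_HOST = re.compile(r"^--- ServiceHost Start ---")

# ASSUMPTION: in the second log a restart is the line that logs
# socMainExecutable. The timestamp and the thread address vary, so both
# are matched by pattern instead of by value.
SOC_MAIN = re.compile(
    r"^\[\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{3} UTC\] "
    r"Main Thread \(0x[0-9a-fA-F]+\)/ 0 INF socMainExecutable"
)


def split_log(filename, marker):
    """Write filename.0, filename.1, ... with one restart per file.

    Returns the list of files written. Lines before the first marker
    are skipped (a choice of this sketch).
    """
    index = -1      # same role as -v i=-1 in the awk command
    out = None      # currently open output file, if any
    written = []
    try:
        with open(filename) as src:
            for line in src:
                if marker.search(line):
                    # New restart: close the previous part, open the next.
                    if out is not None:
                        out.close()
                    index += 1
                    written.append(f"{filename}.{index}")
                    out = open(written[-1], "w")
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_log("instrument.log", SERVICE_HOST) should produce the same three files as the awk command for the sample input, and split_log with SOC_MAIN applies the same loop to the second log. The same idea carries over to awk: replace the fixed pattern with a regular expression that leaves the variable fields open.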
