Restructuring the parse results of a multithreaded log file

<timestamp_in> <first_function_call_input> <thread:1>
    input_parameter_1: value
    input_parameter_2: value
<timestamp_in> <another_function_call_input> <thread:2>
    input_parameters: values
<timestamp_out> <another_function_call_output> <thread:2>
    output_parameters: values
<timestamp_out> <first_function_call_output> <thread:1>
    output_parameters: values

>>> print(parse_results.dump())
-[0]:
    -function: first_function
    -thread: 1
    -timestamp_in: ...
    -timestamp_out: ...
    -input_parameters:
        [0]:
            -parameter_name: input_parameter_1
            -parameter_value: value
        [1]:
            -parameter_name: input_parameter_2
            -parameter_value: value
    -output_parameters:
        [0]: ...
        ...
-[1]:
    -function: another_function
    -thread: 2
    ...

from pyparsing import *
ParserElement.inlineLiteralsUsing(Suppress)
key_val_lines = OneOrMore(Group(Word(alphas)('key') + ':' + Word(nums)('val')))('parameters')
special_key_val_lines = OneOrMore(Group(Word(printables)('key') + ':' + Word(alphas)('val')))('special_parameters')
log = OneOrMore(Group(key_val_lines | special_key_val_lines))('contents').setDebug()
test_string = '''
foo : 1
bar : 2
special_key1! : wow
another_special : abc
normalAgain : 3'''
parse_results = log.parseString(test_string).dump()
print(parse_results)

- contents: [[['foo', '1'], ['bar', '2']], [['special_key1!', 'wow'], ['another_special', 'abc']], [['normalAgain', '3']]]
  [0]:
    [['foo', '1'], ['bar', '2']]
    - parameters: [['foo', '1'], ['bar', '2']]
      [0]:
        ['foo', '1']
        - key: 'foo'
        - val: '1'
      [1]:
        ['bar', '2']
        - key: 'bar'
        - val: '2'
  [1]:
    [['special_key1!', 'wow'], ['another_special', 'abc']]
    - special_parameters: [['special_key1!', 'wow'], ['another_special', 'abc']]
      [0]:
        ['special_key1!', 'wow']
        - key: 'special_key1!'
        - val: 'wow'
      [1]:
        ['another_special', 'abc']
        - key: 'another_special'
        - val: 'abc'
  [2]:
    [['normalAgain', '3']]
    - parameters: [['normalAgain', '3']]
      [0]:
        ['normalAgain', '3']
        - key: 'normalAgain'
        - val: '3'

1 Answer

Answer 1 · Posted 2024-04-23 15:57:32

This is purely a judgment call; I have written parsers in both styles.

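In this case, my instinct is that the code will be cleaner if you focus the parser and its parse actions on grouping, converting, and naming the parts of each individual log item, and then use a separate method to reorganize the items according to your various grouping strategies. My reasoning is that the log message structure is already somewhat complicated, so your parser will have enough work to do just pulling each message into a uniform shape. Also, your grouping strategy is likely to change over time (for example, you may need to collect items that fall within some small time window rather than only exact timestamp matches), and doing this in a separate post-processing method keeps those changes localized.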
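As a rough illustration of the post-processing idea, the sketch below pairs each input record with its matching output record on the same thread. It assumes the parser has already reduced every log item to a plain dict; the key names (`timestamp`, `function`, `thread`, `direction`, `parameters`) and the `pair_calls` helper are hypothetical, not something the question defines. It also assumes that calls on one thread nest, so an output record closes the most recently opened call on that thread.

```python
def pair_calls(records):
    """Merge 'in' and 'out' records into one entry per function call.

    Assumption: within a single thread, calls are properly nested, so an
    'out' record always closes the most recent open call on that thread.
    """
    open_calls = {}  # thread id -> stack of calls still waiting for output
    calls = []       # one merged entry per call, in order of entry
    for rec in records:
        if rec["direction"] == "in":
            call = {
                "function": rec["function"],
                "thread": rec["thread"],
                "timestamp_in": rec["timestamp"],
                "timestamp_out": None,
                "input_parameters": rec["parameters"],
                "output_parameters": [],
            }
            calls.append(call)
            open_calls.setdefault(rec["thread"], []).append(call)
        else:
            call = open_calls[rec["thread"]].pop()
            call["timestamp_out"] = rec["timestamp"]
            call["output_parameters"] = rec["parameters"]
    return calls


# Hypothetical records, in the same order as the log layout in the question.
records = [
    {"timestamp": 1, "function": "first_function", "thread": 1,
     "direction": "in", "parameters": [("input_parameter_1", "value")]},
    {"timestamp": 2, "function": "another_function", "thread": 2,
     "direction": "in", "parameters": [("input_parameters", "values")]},
    {"timestamp": 3, "function": "another_function", "thread": 2,
     "direction": "out", "parameters": [("output_parameters", "values")]},
    {"timestamp": 4, "function": "first_function", "thread": 1,
     "direction": "out", "parameters": [("output_parameters", "values")]},
]
calls = pair_calls(records)
for call in calls:
    print(call["function"], call["thread"],
          call["timestamp_in"], call["timestamp_out"])
```

Because the pairing rule lives in one function, swapping it for a different strategy would not touch the grammar at all.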

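From a testing standpoint, this also lets you test the restructuring code separately from the parsing code, perhaps using a list of dicts or namedtuples to simulate the parsed results from the individual log records.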
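A minimal sketch of that kind of test, assuming a hypothetical `group_by_window` restructuring function and a hypothetical `Record` namedtuple standing in for parsed log records. No parser is involved; the simulated records are built by hand, and the window rule used here (a record joins the current group if it is within `window` seconds of that group's first record) is only one possible choice.

```python
from collections import namedtuple

# Stand-in for one parsed log record; the field names are assumptions.
Record = namedtuple("Record", "timestamp function thread")


def group_by_window(records, window):
    """Group records whose timestamps lie within `window` seconds of the
    first record in the current group."""
    groups = []
    for rec in sorted(records, key=lambda r: r.timestamp):
        if groups and rec.timestamp - groups[-1][0].timestamp <= window:
            groups[-1].append(rec)
        else:
            groups.append([rec])
    return groups


# Simulated parse results, written by hand for the test.
fake_records = [
    Record(0.00, "first_function", 1),
    Record(0.02, "another_function", 2),
    Record(1.50, "first_function", 1),
]
groups = group_by_window(fake_records, window=0.05)
assert [len(g) for g in groups] == [2, 1]
assert groups[0][1].function == "another_function"
```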

TL;DR: I would go with the post-processing method to do the final sorting and restructuring of your parsed log records.

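EDIT: To modify the parse results in place, define a parse action that takes a single argument (I usually name it tokens), and modify it in place using the typical list or dict mutators: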

def rearrange(tokens):
    # mutate tokens in place
    tokens.contents[0].parameters.append(tokens.contents[2].parameters[0])

log.addParseAction(rearrange)

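If you return None (as in this example), the tokens structure that was passed in is kept as the token structure to be returned. If you return a non-None value, the new return value replaces the given tokens in the parser output. This is how integer parsers convert the parsed string into an actual integer, or how date/time parsers convert the parsed string into a Python datetime.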
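Both behaviors can be seen side by side in a small sketch. The names `to_int`, `tag_count`, `integer`, and `numbers` are made up for illustration; the pyparsing calls used are `Word`, `OneOrMore`, `setParseAction`, and `parseString`.

```python
from pyparsing import OneOrMore, Word, nums


def to_int(tokens):
    # Returning a non-None value replaces the matched tokens in the output.
    return int(tokens[0])


def tag_count(tokens):
    # Mutates tokens in place and returns None, so the modified tokens
    # structure is what the parser returns.
    tokens.append(len(tokens))


integer = Word(nums).setParseAction(to_int)
numbers = OneOrMore(integer).setParseAction(tag_count)

result = numbers.parseString("10 20 30")
print(result.asList())  # the three converted ints followed by their count
```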
