Reading a log file of <key>=<value> pairs with Pandas

1 Answer

User

#1 · Posted on 2024-04-25 17:03:47

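Thanks to @edouardtheron and the shlex module for the nudge in the right direction.

If you have a better solution, please feel free to post an answer.

That said, here is what I came up with. First, import the libraries: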

import re
import shlex
import pandas as pd

Create some sample data:

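[The sample data block is missing from this copy of the answer. It assigned a multi-line log string to test_string.]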
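Since the original sample data did not survive, the following is only a stand-in. The two log lines are an assumption, inferred from the regex and from the DataFrame shown at the end of the answer; they are not the author's original data.

```python
# Hypothetical stand-in for the lost sample data. The values are inferred from
# the output table at the end of the answer, not copied from the original post.
test_string = '''
Tue Apr  3 08:51:05 2018 foo=123 bar=321 spam=eggs msg="String with spaces in it"
Tue Apr  3 10:31:46 2018 foo=111 bar=222 spam=eggs msg="Different string"
'''

# Only the non-empty lines carry log records.
log_lines = [line for line in test_string.split('\n') if line]
```

Note the two spaces before the single-digit day: both the regex and the date format used below expect a space-padded day.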

Create a regex that matches the whole line but splits it into two groups:

1: the date at the start ((?:[a-zA-Z]{3,4} ){2} \d \d\d:\d\d:\d\d \d{4})

2: everything else (.*)

patt = re.compile(r'((?:[a-zA-Z]{3,4} ){2} \d \d\d:\d\d:\d\d \d{4}) (.*)')
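To make the two groups concrete, here is a small sketch that applies the same pattern to a single line. The log line itself is a hypothetical example, not taken from the original post.

```python
import re

# Same pattern as in the answer, written as a raw string.
patt = re.compile(r'((?:[a-zA-Z]{3,4} ){2} \d \d\d:\d\d:\d\d \d{4}) (.*)')

# Hypothetical log line, for illustration only.
line = 'Tue Apr  3 08:51:05 2018 foo=123 bar=321'

# Group 1 is the timestamp, group 2 is the rest of the line.
log_time, key_values = patt.match(line).groups()

# A two-digit day has no padding space, so this pattern does not match it.
two_digit_day = patt.match('Tue Apr 13 08:51:05 2018 foo=1')
```

Here log_time holds the timestamp text and key_values holds the key=value part. As written, the pattern only accepts a single-digit, space-padded day; two_digit_day is None, so real logs spanning a whole month would probably need \d{1,2} with flexible whitespace instead.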

Loop over the lines, applying the regex to each one and turning its key=value part into a Series:

sers = []
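for line in test_string.split('\n'):
    matt = patt.match(line)
    if not matt:
        # Skip empty lines and lines that do not match the pattern
        continue
    # Extract the two groups: timestamp and the key=value part
    time, key_values = matt.groups()

    # shlex.split keeps quoted values together; split each token on the first '='
    ser = pd.Series(dict(token.split('=', 1) for token in shlex.split(key_values)))
    ser['log_time'] = time
    sers.append(ser)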
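The reason for shlex.split rather than a plain str.split is the quoted value. A minimal sketch of the difference, using a hypothetical key=value string:

```python
import shlex

# Hypothetical key=value payload, for illustration only.
key_values = 'foo=123 bar=321 msg="String with spaces in it"'

# A plain whitespace split breaks the quoted message into pieces.
naive_tokens = key_values.split()

# shlex.split honours shell-style quoting and strips the quotes.
tokens = shlex.split(key_values)

# Splitting on the first '=' only keeps any later '=' inside the value.
pairs = dict(token.split('=', 1) for token in tokens)
```

With shlex, tokens has three entries and pairs['msg'] is the full message without quotes. A token that contains no '=' at all would make the dict() call raise a ValueError, so malformed lines may need extra handling.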

Finally, concatenate all the rows into a single DataFrame:

# Concatenate the Series into a DataFrame (one row per log line)
df = pd.concat(sers, axis=1).T
# Convert 'log_time' from a string to an actual datetime
df['log_time'] = pd.to_datetime(df['log_time'], format='%a %b  %d %X %Y', exact=True)

This produces the following DataFrame:

   bar  foo                       msg  spam            log_time
0  321  123  String with spaces in it  eggs 2018-04-03 08:51:05
1  222  111          Different string  eggs 2018-04-03 10:31:46
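For reference, the steps above can be gathered into one function. This is only a sketch under some assumptions: parse_log and LINE_RE are hypothetical names, the sample lines are made up to match the table above, and the rows are collected as dicts and passed to pd.DataFrame, which is a variation on the concat approach in the answer rather than the author's method.

```python
import re
import shlex

import pandas as pd

# Same pattern as in the answer: group 1 is the timestamp, group 2 the rest.
LINE_RE = re.compile(r'((?:[a-zA-Z]{3,4} ){2} \d \d\d:\d\d:\d\d \d{4}) (.*)')


def parse_log(text):
    """Parse 'timestamp key=value ...' lines into a DataFrame (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip blank or non-matching lines
        log_time, key_values = match.groups()
        row = dict(token.split('=', 1) for token in shlex.split(key_values))
        # Collapse the space-padded day ("Apr  3" -> "Apr 3") before parsing.
        row['log_time'] = ' '.join(log_time.split())
        rows.append(row)
    df = pd.DataFrame(rows)
    # Explicit time fields instead of the locale-dependent %X.
    df['log_time'] = pd.to_datetime(df['log_time'], format='%a %b %d %H:%M:%S %Y')
    return df


# Hypothetical sample lines, not the original post's data.
sample = '''
Tue Apr  3 08:51:05 2018 foo=123 bar=321 spam=eggs msg="String with spaces in it"
Tue Apr  3 10:31:46 2018 foo=111 bar=222 spam=eggs msg="Different string"
'''
df = parse_log(sample)
```

All parsed values except log_time stay as strings; columns such as foo and bar could be converted afterwards with pd.to_numeric if numbers are needed. The sketch also assumes at least one line matches, since an empty result would have no log_time column to convert.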
