String manipulation by a specific delimiter and writing to a text fi

Posted 2024-03-29 09:31:07


I am writing a function that takes a file, updates.txt, as input. The file looks like this:

---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
    Type: 16(BGP4MP)
    Subtype: 4(BGP4MP_MESSAGE_AS4)
    Length: 39
BGP4MP_MESSAGE_AS4
    Peer AS Number: 37989
    Local AS Number: 12654
    Interface Index: 0
    Address Family: 1(IPv4)
    Peer IP Address: 203.123.48.6
    Local IP Address: 193.0.4.28
BGP Message
    Marker: -- ignored --
    Length: 19
    Type: 4(KEEPALIVE)
---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
    Type: 16(BGP4MP)
    Subtype: 4(BGP4MP_MESSAGE_AS4)
    Length: 118
BGP4MP_MESSAGE_AS4
    Peer AS Number: 1836
    Local AS Number: 12654
    Interface Index: 0
    Address Family: 1(IPv4)
    Peer IP Address: 146.228.1.3
    Local IP Address: 193.0.4.28
BGP Message
    Marker: -- ignored --
    Length: 98
    Type: 2(UPDATE)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 71
    Path Attribute Flags/Type/Length: 0x40/1/1
        ORIGIN: 0(IGP)
    Path Attribute Flags/Type/Length: 0x40/2/42
        AS_PATH
            Path Segment Type: 2(AS_SEQUENCE)
            Path Segment Length: 10
            Path Segment Value: 1836 174 6453 37282 37088 37629 37629 37629 37629 37629
    Path Attribute Flags/Type/Length: 0x40/3/4
        NEXT_HOP: 146.228.1.3
    Path Attribute Flags/Type/Length: 0xc0/8/12
        COMMUNITY: 1836:110 1836:6000 1836:6031
    NLRI: 154.65.7.0/24
---------------------------------------------------

The file is a sequence of "blocks". Each block is delimited by lines of dashes:

---------------------------------------------------
# Block (n)
---------------------------------------------------
# Block (n+1)
---------------------------------------------------
# Block (n+2) , etc

I want to read the whole file block by block and produce a text file that contains only the lines for the following fields: Timestamp, Peer AS Number, Local AS Number, Peer IP Address, Local IP Address.

The resulting .txt file should look like this:

---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
BGP4MP_MESSAGE_AS4
    Peer AS Number: 37989
    Local AS Number: 12654
    Peer IP Address: 203.123.48.6
    Local IP Address: 193.0.4.28
---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
BGP4MP_MESSAGE_AS4
    Peer AS Number: 1836
    Local AS Number: 12654
    Peer IP Address: 146.228.1.3
    Local IP Address: 193.0.4.28
---------------------------------------------------

Ideally, I would like to overwrite updates.txt with the new text file so that no space is wasted, and save it in a new directory, "parsed updates".

I know this is minimal, because I am stuck on splitting by the line of dashes, but my code so far looks like this:

import sys
import os

def parser(filename):
    # Read the whole file into a string so it can be split later
    with open(filename, 'r') as info:
        text = info.read()

    # Here comes the string manipulation code
    # text.split('---------------------------------------------------')

    print('The file has been parsed successfully !!')

def main():
    parser('updates.txt')


if __name__=='__main__':
    main()

2 answers

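In this particular case, you do not even need to split the file into separate blocks before parsing. You can go through it line by line and keep each line that matches one of the kinds of information you want.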

import re

out_lines = []
# One pattern per kind of line to keep
regexes = [
    r'^-+$',
    r'^MRT Header\s*$',
    r'^\s*Timestamp:.*$',
    r'^BGP4MP_MESSAGE_AS4\s*$',
    r'^\s*Peer AS Number:.*$',
    r'^\s*Local AS Number:.*$',
    r'^\s*Peer IP Address:.*$',
    r'^\s*Local IP Address:.*$',
]
with open('file.txt', 'r') as f:
    for line in f:
        for regex in regexes:
            if re.match(regex, line):
                out_lines.append(line)
                break

# Each kept line still ends with '\n', so write the lines back unchanged
with open('file.txt', 'w') as f:
    f.writelines(out_lines)
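
As a hedged variation on the line-by-line idea above, the following sketch wraps the filter in a function and writes the result into a separate directory, which the question asked for, instead of overwriting the input in place. The names `filter_lines` and `filter_file` and the directory name `parsed_updates` are assumptions made for illustration; they do not come from the question or the answer.

```python
import os
import re

# One combined pattern for the lines to keep (field names taken from the question).
KEEP = re.compile(
    r'^(-+|MRT Header|BGP4MP_MESSAGE_AS4|'
    r'\s*(Timestamp|Peer AS Number|Local AS Number|'
    r'Peer IP Address|Local IP Address):.*)\s*$'
)

def filter_lines(lines):
    """Return only the lines that match KEEP, unchanged."""
    return [line for line in lines if KEEP.match(line)]

def filter_file(src, out_dir='parsed_updates'):
    """Filter src and write the result into out_dir under the same file name.

    out_dir is a hypothetical directory name; it is created if missing.
    """
    os.makedirs(out_dir, exist_ok=True)
    dst = os.path.join(out_dir, os.path.basename(src))
    with open(src) as f:
        kept = filter_lines(f)
    with open(dst, 'w') as f:
        f.writelines(kept)
    return dst
```

Calling `filter_file('updates.txt')` would leave updates.txt untouched and return the path of the filtered copy; removing the original afterwards is a separate decision.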

>>> with open('results.txt', 'w') as r:
...     with open('updates.txt', 'r') as u:
...         for line in u:
...             if '-'*51 in line:
...                 r.write(line)
...             else:
...                 if any(field in line for field in ['Timestamp', 'Peer AS Number', 'Local AS Number', 'Peer IP Address', 'Local IP Address', 'MRT Header']):
...                     r.write(line)

The resulting file looks like this:

$ cat results.txt
---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
    Peer AS Number: 37989
    Local AS Number: 12654
    Peer IP Address: 203.123.48.6
    Local IP Address: 193.0.4.28
---------------------------------------------------
MRT Header
    Timestamp: 1453939200(2016-01-28 01:00:00)
    Peer AS Number: 1836
    Local AS Number: 12654
    Peer IP Address: 146.228.1.3
    Local IP Address: 193.0.4.28
---------------------------------------------------
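
If the file really does need to be handled block by block, as the question first intended, one possible sketch is to split the text on the dashed delimiter and filter each block separately. The helper names `split_blocks`, `filter_block`, and `filter_text` are hypothetical, and the sketch assumes that a delimiter is any line consisting only of dashes.

```python
import re

WANTED = ('Timestamp', 'Peer AS Number', 'Local AS Number',
          'Peer IP Address', 'Local IP Address')
HEADERS = ('MRT Header', 'BGP4MP_MESSAGE_AS4')

def split_blocks(text):
    """Split text on lines made up only of dashes; drop empty pieces."""
    parts = re.split(r'^-+[ \t]*$', text, flags=re.MULTILINE)
    return [p.strip('\n') for p in parts if p.strip()]

def filter_block(block):
    """Keep the section headers and the wanted 'Field: value' lines of one block."""
    kept = []
    for line in block.splitlines():
        # Text before the first colon, without indentation, is the field name
        name = line.strip().split(':', 1)[0]
        if name in WANTED or line.rstrip() in HEADERS:
            kept.append(line)
    return '\n'.join(kept)

def filter_text(text, sep='-' * 51):
    """Filter every block and re-join the blocks with the dashed separator."""
    blocks = [filter_block(b) for b in split_blocks(text)]
    return ''.join(sep + '\n' + b + '\n' for b in blocks) + sep + '\n'
```

Unlike the substring check above, this version also keeps the BGP4MP_MESSAGE_AS4 header line, which matches the output requested in the question. Splitting first is only worthwhile if later processing needs whole blocks; for plain filtering, the line-by-line versions are simpler.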
