How do I extract specific words containing certain characters from a file?

Posted 2024-04-25 12:39:06

I have a file, log.txt, which contains:

Router:94.126.126.109
Name:nl-rtm02a-ra2
show running-config interface^M
^MWed Jul 11 12:42:03.409 CET^M
! ****  Configuration start **** RING  rt72-central  RA2 ****^M
! # RING INTERFACE CONFIGURATION^M

 service-policy output NA4-PM-FRFB+COS^M
 ipv4 address 84.116.244.181 255.255.255.252^M
 bundle minimum-active links 1^M
 load-interval 30^M
 flow ipv4 monitor NA4-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
 flow ipv6 monitor NA4-IPV6-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
!^M
interface Bundle-Ether1001^M
 description ** ICL to RA2-SAT1 **^M
 vrf NV_Mgmt^M
 ipv4 point-to-point^M
 ipv4 unnumbered Loopback1000^M
 load-interval 30^M
 flow ipv4 monitor NA4-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
 flow ipv6 monitor NA4-IPV6-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
 nv^M
  satellite-fabric-link satellite 1001^M
   remote-ports GigabitEthernet 0/0/0-43^M
  !^M
 !^M
!^M
interface Bundle-Ether2000^M
 description ** LACP Uplink to rt53cbr68 **^M
 mtu 9192^M
 bundle minimum-active links 1^M
 load-interval 30^M
!^M
interface Bundle-Ether2000.251^M
 description ** rt53abr68 IPv4 B-Side **^M
 vrf 03109128:NL_CMTS_ACCESS^M
 ipv4 mtu 1500^M
 ipv4 address 212.142.4.45 255.255.255.252^M
 flow ipv4 monitor NA4-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
 flow ipv6 monitor NA4-IPV6-MONITOR-MAP sampler NA4-SAMPLER-MAP ingress^M
 encapsulation dot1q 251^M
!^M
interface Bundle-Ether2000.651^M
 description ** rt53dbr68 IPv6 B-Side **^M
 ipv6 nd prefix default no-autoconfig^M
 ipv6 address 2a02:a200:40:56::1/64^M
 encapsulation dot1q 651^M
!^M
interface Bundle-Ether2000.701 l2transport^M
 description ** BSOD SDN-NFV Traffic rt53cbr68 **^M
 encapsulation dot1q 2501-2699^M

From this file I need to extract the words that contain "cbr", "abr", or "dbr" and store them in a CSV file.

For example, from the content above I want to extract:

1. rt53cbr68
2. rt53abr68
3. rt53dbr68

I tried the following code:

with open("file.txt", "r") as f:
    searchlines = f.readlines()

# Print each line containing "cbr" together with the two lines after it
for i, line in enumerate(searchlines):
    if "cbr" in line:
        for l in searchlines[i:i+3]:
            print(l)

One more thing: I also want to get the Router value from the file content and store it in a variable.


3 answers

This matches any description line that contains abr, cbr, or dbr:

>>> import re
>>> list(enumerate(re.findall(r'description.*\s(.*?[cad]br.*?)\s', data)))
[(0, 'rt53cbr68'), (1, 'rt53abr68'), (2, 'rt53dbr68'), (3, 'rt53cbr68')]
>>> 
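
The session above assumes that `data` already holds the file contents. A minimal self-contained sketch of the same idea follows. The `data` string is an assumption: a few lines copied from the question's log, with `\r\n` line endings standing in for the `^M` markers. The helper name `find_devices` is hypothetical.

```python
import re

# Same pattern as in the interactive session above.
DESCRIPTION_RE = re.compile(r'description.*\s(.*?[cad]br.*?)\s')


def find_devices(data):
    """Return the abr/cbr/dbr words found on description lines."""
    return DESCRIPTION_RE.findall(data)


# Assumed sample: a few lines from the question's log, "\r\n" in place of ^M.
data = (
    "interface Bundle-Ether2000\r\n"
    " description ** LACP Uplink to rt53cbr68 **\r\n"
    " mtu 9192\r\n"
    "interface Bundle-Ether2000.251\r\n"
    " description ** rt53abr68 IPv4 B-Side **\r\n"
    "interface Bundle-Ether2000.651\r\n"
    " description ** rt53dbr68 IPv6 B-Side **\r\n"
)

devices = find_devices(data)
print(list(enumerate(devices)))
```

To run it against the real file, `data` could be filled with `open("log.txt").read()` instead of the inline sample.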

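In your example the line structure doesn't matter, so I suggest using read() instead of readlines() and calling split() to get a list of words (split() with no arguments splits on any whitespace, including spaces and "\n").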

with open("file.txt", "r") as f:
    words = f.read().split()
    routerNames = []
    z = 1
    for wrd in words:
        if ("cbr" in wrd) or ("abr" in wrd) or ("dbr" in wrd):
            routerNames.append(str(z)+ ". " + wrd)
            z+=1

    with open("file2.txt","w") as g:
        g.write("\n".join(routerNames))
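
Since the question asks for a CSV file, the same filter could also feed Python's `csv` module. This is only a sketch: the helper name `write_names_csv`, the column headers, and the inline sample text are assumptions made for illustration, and an in-memory buffer stands in for the output file.

```python
import csv
import io


def write_names_csv(words, out):
    """Write numbered words containing abr/cbr/dbr to a CSV stream; return the rows."""
    rows = []
    for wrd in words:
        if any(tag in wrd for tag in ("cbr", "abr", "dbr")):
            rows.append((len(rows) + 1, wrd))
    writer = csv.writer(out)
    writer.writerow(["index", "name"])  # header names are an assumption
    writer.writerows(rows)
    return rows


# Assumed sample text; in practice this would come from f.read().
text = (
    " description ** LACP Uplink to rt53cbr68 **\r\n"
    " description ** rt53abr68 IPv4 B-Side **\r\n"
)
buffer = io.StringIO()
rows = write_names_csv(text.split(), buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with `newline=""` (as the `csv` documentation recommends), for example `open("routers.csv", "w", newline="")`, where the filename is again hypothetical.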

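Note: this code picks up every word containing those substrings, including ones you don't want. I suggest adding another condition to reduce false matches.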

with open("file.txt", "r") as f:
    words = f.read().split()


    for wrd in words:
        if (("cbr" in wrd) or ("abr" in wrd) or ("dbr" in wrd)) and ("rt" in wrd):
            ...

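To match your values, you can use a regular expression with finditer.

You can match one or more word characters \w+ followed by one or more digits \d+, then use the character class [cad], which matches any one of those characters, followed by br and one or more digits.

For the router value, you can use a named group (?P<router>\d+(?:\.\d+)+) and a positive lookbehind (?<= to assert that what is on the left is Router:, preceded by a word boundary \b.

Combine the two with an alternation |.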
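
The answer's original snippet did not survive on this page, so the following is a sketch reconstructed from the description above, not the author's exact code. The combined pattern, the helper name `extract`, and the sample text are assumptions.

```python
import re

# Assumed pattern, assembled from the pieces described in the text above.
PATTERN = re.compile(
    r"\b\w+\d+[cad]br\d+\b"                      # device name, e.g. rt53cbr68
    r"|(?<=\bRouter:)(?P<router>\d+(?:\.\d+)+)"  # dotted value right after "Router:"
)


def extract(text):
    """Return (router_value, device_names) found in text."""
    router = None
    names = []
    for m in PATTERN.finditer(text):
        if m.group("router") is not None:
            router = m.group("router")
        else:
            names.append(m.group(0))
    return router, names


# Assumed sample: a few lines from the question's log.
sample = (
    "Router:94.126.126.109\n"
    "Name:nl-rtm02a-ra2\n"
    " description ** LACP Uplink to rt53cbr68 **\r\n"
    " description ** rt53abr68 IPv4 B-Side **\r\n"
    " description ** rt53dbr68 IPv6 B-Side **\r\n"
)
router, names = extract(sample)
print(router)
print(names)
```

Python's `re` requires lookbehinds to have a fixed width; `\bRouter:` qualifies because `\b` is zero-width and the rest is a literal.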
