Parsing a text file into a CSV file with Python

Posted 2024-06-16 13:23:47


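I have a text file containing the output of a command I ran with Netmiko to retrieve data from a Cisco WLC about things that are causing interference on our WiFi network. I've trimmed the original 600,000 lines of output down to the few thousand lines I need, which look like this: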

AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017   

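The goal is to end up with a CSV file that strips out the "AP Name..." label and puts what is left (the AP name) on the same line as the information from the lines that follow it. The problem is that some APs have two lines below the AP name, while others have one or none. I've been at this for 8 hours and can't find a good way to do it.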

This is the latest version of the code I've tried. Any suggestions on how to make it work? I just want something I can load into Excel to create a report:

with open(outfile_name, 'w') as out_file:
    with open('wlc-interference_raw.txt', 'r')as in_file:
        #Variables
        _ap_name = ''
        _temp = ''
        _flag = False
        for i in in_file:
            if 'AP Name' in i:
                #write whatever was put in the temp file to disk because new ap now
                #add another temp variable in case an ap has more than 1 interferer and check if new AP name
                out_file.write(_temp)
                out_file.write('\n')
                #print(_temp)
                _ap_name = i.lstrip('AP Name.......................................... ')
                _ap_name = _ap_name.rstrip('\n')
                _temp = _ap_name
                #print(_temp)
            elif '----' in i:
                pass
            elif 'Class Type' in i:
                pass
            else:
                line_split = i.split()
                for x in line_split:
                    _temp += ','
                    _temp += x
                _temp += '\n'

1 answer
User
Answer #1 · posted 2024-06-16 13:23:47

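I think the best option is to read all the lines of the file and split them into sections, each starting with a line that begins with AP Name. You can then parse each section on its own.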

Example

s = """AP Name.......................................... 010-HIGH-FL4-AP04
Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   
WiMax Mobile               11       0       -84         Fri Dec 15 17:09:45 2017   
WiMax Fixed                11       0       -68         Tue Dec 12 09:29:30 2017   
AP Name.......................................... 010-2nd-AP04
Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   
WiMax Fixed                11       0       -78         Mon Dec 11 12:33:10 2017   
AP Name.......................................... 139-FL1-AP03
Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   
AP Name.......................................... 010-HIGH-FL3-AP04
Microwave Oven             11       10      -55         Mon Dec 18 07:51:23 2017   
WiMax Mobile               11       0       -83         Wed Dec 13 16:16:26 2017"""

import re

class AP:
    """ 
    A class holding each section of the parsed file
    """
    def __init__(self):
        self.header = ""
        self.content = []

sections = []
section = None
for line in s.split('\n'):  # Or 'for line in file:'
    # Starting new section
    if line.startswith('AP Name'):
        # If previously had a section, add to list
        if section is not None:
            sections.append(section)  
        section = AP()
        section.header = line
    else:
        if section is not None:
            section.content.append(line)
if section is not None:
    sections.append(section)  # Add last section outside of loop


for section in sections:
    # Remove the literal "AP Name....... " prefix with a regex. str.lstrip() treats its
    # argument as a set of characters, so it could also eat the start of an AP name
    ap_name = re.sub(r'^AP Name\.*\s*', '', section.header).strip()
    for line in section.content:
        if not line.strip():
            continue  # Skip blank lines
        # You can extract the date separately, if needed
        # Splitting on two or more whitespace characters keeps "Microwave Oven"
        # and the timestamp together as single fields
        fields = re.split(r'\s\s+', line.strip())
        print(ap_name + "," + ",".join(fields))

Output

010-HIGH-FL4-AP04,Microwave Oven,11,10,-59,Mon Dec 18 08:21:23 2017
010-HIGH-FL4-AP04,WiMax Mobile,11,0,-84,Fri Dec 15 17:09:45 2017
010-HIGH-FL4-AP04,WiMax Fixed,11,0,-68,Tue Dec 12 09:29:30 2017
010-2nd-AP04,Microwave Oven,11,10,-61,Sat Dec 16 11:20:36 2017
010-2nd-AP04,WiMax Fixed,11,0,-78,Mon Dec 11 12:33:10 2017
139-FL1-AP03,Microwave Oven,6,18,-51,Fri Dec 15 12:26:56 2017
010-HIGH-FL3-AP04,Microwave Oven,11,10,-55,Mon Dec 18 07:51:23 2017
010-HIGH-FL3-AP04,WiMax Mobile,11,0,-83,Wed Dec 13 16:16:26 2017
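
The last field of each row is a timestamp. If you need it as a real date (for sorting or filtering before loading into Excel), it can be parsed separately. The following is only a sketch: the helper name `split_row` is hypothetical, and it assumes every data line ends with a timestamp shaped like `Mon Dec 18 08:21:23 2017`, as in the sample, and that day and month names are in English.

```python
import re
from datetime import datetime

def split_row(line):
    """Split one interferer line into (fields, timestamp).

    Assumes columns are separated by two or more spaces and that the
    last column is a timestamp like 'Mon Dec 18 08:21:23 2017'.
    """
    fields = re.split(r'\s{2,}', line.strip())
    # %a and %b are locale-dependent; English names are assumed here
    timestamp = datetime.strptime(fields[-1], '%a %b %d %H:%M:%S %Y')
    return fields[:-1], timestamp

fields, ts = split_row(
    "Microwave Oven      11       10      -59         Mon Dec 18 08:21:23 2017   "
)
print(fields, ts.isoformat())
```

`datetime.strptime` raises `ValueError` on a line that does not match the format, so a malformed row is noticed rather than silently written out.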

Tip:

You don't need Python to write the CSV file; you can redirect the script's output to a file from the command line:

python script.py > output.csv
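
If you would rather write the file from Python, for example because a field might one day contain a comma and need quoting, the standard library `csv` module can do it. This is a sketch under the same assumption as above (columns separated by two or more spaces), not a tested solution for the full WLC output. The function names `wlc_to_rows` and `write_csv` are hypothetical, and the `----` and `Class Type` checks are taken from the code in the question.

```python
import csv
import io
import re

def wlc_to_rows(lines):
    """Yield one [ap_name, field, ...] list per interferer line.

    An AP with no interferer lines under it produces no rows.
    """
    ap_name = None
    for line in lines:
        line = line.strip()
        if line.startswith('AP Name'):
            # Remove the literal label and its dotted leader
            ap_name = re.sub(r'^AP Name\.*\s*', '', line)
        elif not line or '----' in line or 'Class Type' in line:
            continue  # blank lines and table headers
        elif ap_name is not None:
            yield [ap_name] + re.split(r'\s{2,}', line)

def write_csv(lines, out_file):
    """Write all rows to an already opened text file object."""
    csv.writer(out_file).writerows(wlc_to_rows(lines))

# Demo on a few sample lines, written to an in-memory buffer
sample = [
    "AP Name.......................................... 010-2nd-AP04",
    "Microwave Oven             11       10      -61         Sat Dec 16 11:20:36 2017   ",
    "AP Name.......................................... 139-FL1-AP03",
    "Microwave Oven             6        18      -51         Fri Dec 15 12:26:56 2017   ",
]
buf = io.StringIO()
write_csv(sample, buf)
print(buf.getvalue())
```

With real files, pass the open input file as `lines` and open the output with `open('output.csv', 'w', newline='')`; the `newline=''` argument is what the `csv` documentation recommends so that no extra blank lines appear on Windows.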
