Python script to extract data from a txt file to csv

Published 2024-05-29 04:03:47

I am trying to write a Python script that extracts Wi-Fi scan data from a txt file into a csv file.

Here is the txt data:

Wed Oct  7 09:00:01 UTC 2020

BSS 02:ca:fe:ca:ca:40(on ap0_1)
freq: 2422
capability: IBSS (0x0012)
signal: -60.00 dBm
primary channel: 3
last seen: 30 ms ago
BSS ac:86:74:0a:73:a8(on ap0_1)
TSF: 229102338752 usec (2d, 15:38:22)
freq: 2422
capability: ESS (0x0421)
signal: -62.00 dBm
primary channel: 3

I need to extract the txt data into a csv file in the following format:

 Time                        | BSS                       | freq |capability   |signal| primary channel |                                                
 ----------------------------+---------------------------+------+-------------+------+-----------------+                  
 Wed Oct  7 09:00:01 UTC 2020|02:ca:fe:ca:ca:40(on ap0_1)| 2422 |IBSS (0x0012)|-60.00|             3   |
                             |ac:86:74:0a:73:a8(on ap0_1)| 2422 |ESS (0x0421) |-62.00|             3   |

Here is my unfinished code:

import csv
import re

fieldnames = ['TIME', 'BSS', 'FREQ','CAPABILITY', 'SIGNAL', 'CHANNEL']

re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)

with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames= fieldnames)
    csv_output.writeheader()
    start = False

    for line in f_input:
        line = line.strip()

        if len(line):
            if 'BSS' in line:
                if start:
                    start = False
                    block.append(line)
                    text_block = '\n'.join(block)

                    for field, value in re_fields.findall(text_block):
                        entry[field.upper()] = value

                    if line[0] == 'on ap0_1':
                        entry['BSS'] = block[0]

                    csv_output.writerow(entry)

                else:
                    start = True
                    entry = {}
                    block = [line]
            elif start:
                block.append(line)

When I run it, the data does not end up in the right columns.

Please let me know how to fix this. I am new to programming, so any suggestions would be much appreciated. Thanks, everyone.

3 answers

Use str.startswith.

Example:

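import csv

filename = 'ap0_1.txt'   # input path, as used in the question
outfile = 'ap0_1.csv'    # output path, as used in the question

fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')
with open(filename) as f_input, open(outfile, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()
    result = {"TIME": next(f_input).strip()}   # Get the time from the first line
    for line in f_input:
        line = line.strip()
        if line.startswith(fieldnames):        # startswith accepts a tuple of prefixes
            if line.startswith('BSS'):
                if 'BSS' in result:            # a new BSS entry begins: write the previous one
                    csv_output.writerow(result)
                    result = {}
                key, value = line.split(" ", 1)
            else:
                key, value = line.split(": ", 1)
            result[key] = value

    csv_output.writerow(result)                # write the last entry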
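
The filter in this answer relies on str.startswith accepting a tuple of prefixes and returning True when the string begins with any of them. A minimal sketch of just that step, assuming a few lines copied from the question's sample as input (the names sample_lines and kept are only for illustration):

```python
# str.startswith accepts a tuple of prefixes; any match returns True.
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')

# A few lines taken from the sample data in the question.
sample_lines = [
    'BSS 02:ca:fe:ca:ca:40(on ap0_1)',
    'freq: 2422',
    'TSF: 229102338752 usec (2d, 15:38:22)',
    'last seen: 30 ms ago',
]

# Keep only the lines that begin with one of the wanted field names.
kept = [line for line in sample_lines if line.startswith(fieldnames)]
print(kept)
```

Lines such as TSF and last seen are dropped because they match none of the prefixes, so they never reach the split step.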

Edit (following the comments)

If you have multiple blocks of the text above:

import re
import csv

filename = 'ap0_1.txt'   # input path, as used in the question
outfile = 'ap0_1.csv'    # output path, as used in the question

week_ptrn = re.compile(r"\b(" + "|".join(('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')) + r")\b")
fieldnames = ('TIME', 'BSS', 'freq', 'capability', 'signal', 'primary channel')

with open(filename) as f_input, open(outfile, 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames)
    csv_output.writeheader()
    result = []    # one dict per output row
    for line in f_input:
        line = line.strip()
        if week_ptrn.match(line):
            result.append({"TIME": line})      # a timestamp line starts a new block
        elif line.startswith(fieldnames) and result:
            if line.startswith('BSS'):
                if 'BSS' in result[-1]:        # another BSS under the same timestamp: new row
                    result.append({})
                key, value = line.split(" ", 1)
            else:
                key, value = line.split(": ", 1)
            result[-1][key] = value

    csv_output.writerows(result)

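Your regex searches for the time using the label TIME, but there is no "TIME:" label anywhere in the input data, so an empty TIME column is the expected result.

I also think there is a problem with these lines: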
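
To make this concrete, here is a small sketch that runs the question's own re_fields pattern over the first block of the sample data. The variable names text_block and matches are only for illustration:

```python
import re

# Pattern and field names copied from the question.
fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
re_fields = re.compile(r'({})+:\s(.*)'.format('|'.join(fieldnames)), re.I)

# First block of the sample data, joined the same way the question does.
text_block = '\n'.join([
    'Wed Oct  7 09:00:01 UTC 2020',
    'BSS 02:ca:fe:ca:ca:40(on ap0_1)',
    'freq: 2422',
    'capability: IBSS (0x0012)',
    'signal: -60.00 dBm',
    'primary channel: 3',
])

# Collect whatever the pattern finds, keyed by upper-cased field name.
matches = {field.upper(): value for field, value in re_fields.findall(text_block)}
print(matches)
```

Only the lines written as "label: value" match. The timestamp line has no "TIME:" label and the BSS line has a space rather than a colon after BSS, so neither key appears in matches.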

            if line[0] == 'on ap0_1':
                entry['BSS'] = block[0]

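It looks like you are trying to find "on ap0_1" in "BSS ac:86:74:0a:73:a8(on ap0_1)". But line is a string, so line[0] is only its first character, 'B'; even if the line were split on whitespace into ['BSS', 'ac:86:74:0a:73:a8(on', 'ap0_1)'], the first element would be 'BSS'. Either way the comparison is never true. It should be changed like this: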

            if 'on ap0_1' in block[0]:
                entry['BSS'] = block[0][4:].lstrip()
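
A short sketch of why the original comparison never succeeds and what the replacement does, using one BSS line from the question's sample (the variable names are only for illustration):

```python
line = 'BSS ac:86:74:0a:73:a8(on ap0_1)'

first_char = line[0]            # indexing a string yields a single character
first_token = line.split()[0]   # first whitespace-separated token

original_check = line[0] == 'on ap0_1'   # the question's test
fixed_check = 'on ap0_1' in line         # substring test

bss_value = line[4:].lstrip()   # drop the leading 'BSS ' prefix

print(first_char, first_token, original_check, fixed_check, bss_value)
```

The slice [4:] assumes the line starts with exactly "BSS" followed by a space, which holds for the sample data shown in the question.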

Here is my version of the code:

import csv

fieldnames = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
weekdays = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')

with open('ap0_1.txt') as f_input, open('ap0_1.csv', 'w', newline='') as f_output:
    # extrasaction='ignore' drops keys such as TSF or LAST SEEN that are not columns
    csv_output = csv.DictWriter(f_output, fieldnames=fieldnames, extrasaction='ignore')
    csv_output.writeheader()

    row = {}
    for line in f_input:
        line = line.strip()
        if not line:
            continue
        elif line.startswith(weekdays):
            if 'BSS' in row:            # finish the pending row before a new timestamp
                csv_output.writerow(row)
                row = {}
            row['TIME'] = line
        elif line.startswith('BSS '):
            # not sure how you define the start of a new block; say it is the 'BSS' string
            if 'BSS' in row:
                csv_output.writerow(row)
                row = {}
            row['BSS'] = line.split(' ', 1)[1]   # split exactly once
        elif ': ' in line:
            key, value = line.split(': ', 1)
            key = key.upper()
            if key == 'PRIMARY CHANNEL':
                key = 'CHANNEL'         # match the column name used in fieldnames
            row[key] = value
    if row:
        csv_output.writerow(row)        # write the last row
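
For comparison, here is a self-contained sketch that combines the ideas from this thread and can be run without any files. It makes some assumptions that are not in the original posts: the helper parse_scan, the LABELS mapping and the in-memory io.StringIO buffer are hypothetical names chosen for illustration, a line starting with "BSS " is assumed to begin a new row, and the " dBm" unit is dropped from the signal value to match the requested table.

```python
import csv
import io

COLUMNS = ['TIME', 'BSS', 'FREQ', 'CAPABILITY', 'SIGNAL', 'CHANNEL']
WEEKDAYS = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')
# Input label -> output column (assumed mapping; other labels are ignored).
LABELS = {'freq': 'FREQ', 'capability': 'CAPABILITY',
          'signal': 'SIGNAL', 'primary channel': 'CHANNEL'}


def parse_scan(lines):
    """Yield one dict per BSS entry; TIME is filled only on the first row of a timestamp."""
    time, row = '', None
    for raw in lines:
        line = raw.strip()
        if line.startswith(WEEKDAYS):
            time = line                      # remember the timestamp for the next BSS
        elif line.startswith('BSS '):
            if row is not None:
                yield row                    # previous entry is complete
            row = {'TIME': time, 'BSS': line[4:].strip()}
            time = ''                        # leave TIME blank on following rows
        elif row is not None and ': ' in line:
            label, value = line.split(': ', 1)
            if label in LABELS:
                if label == 'signal':
                    value = value.split()[0]  # keep the number, drop the 'dBm' unit
                row[LABELS[label]] = value
    if row is not None:
        yield row                            # last entry


sample = """Wed Oct  7 09:00:01 UTC 2020

BSS 02:ca:fe:ca:ca:40(on ap0_1)
freq: 2422
capability: IBSS (0x0012)
signal: -60.00 dBm
primary channel: 3
last seen: 30 ms ago
BSS ac:86:74:0a:73:a8(on ap0_1)
TSF: 229102338752 usec (2d, 15:38:22)
freq: 2422
capability: ESS (0x0421)
signal: -62.00 dBm
primary channel: 3
"""

rows = list(parse_scan(sample.splitlines()))

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator='\n')
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

To write a real file instead, the buffer could be replaced by open('ap0_1.csv', 'w', newline='') and the sample by the lines of the input file. Keeping the parsing in a generator separate from the csv writing makes it easier to check the rows before anything is written.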

It looks like the "\n" is what creates the empty rows.
