Rearranging parts of the lines in a file using regular expressions in Python

, 0x40a846, mov [ecx+2bh],al, 88 41 2B, , , , \par
, 0x40a849, jmp $+001775cbh (0x581e14), E9 C6 75 17 00, , , , \par
, 0x40a84e, int3, CC, , , , \par
, 0x40a84f, int3, CC, , , , \par
, 0x40a850, push esi, 56, , , , \par
, 0x40a851, mov esi,ecx, 8B F1, , , , \par

import csv
import string
import re, sys

file_to_change = 'testingthecodexlconverter.csv'
# = raw_input("Please specify what codexl file you would like to convert: ")
file1 = open(file_to_change, 'r+')
with file1 as f:
    for line in f:
        line = line[2:-12]
        line = line.rstrip('\n') + ',,'
        # mo = re.search(r'(.*?),.*?.*?,.*?(.*?),.*?.*?,.*?(.*?),.*?.*?,.*?(.*?)', line)
        #mo = re.search(r'(.*?),.*?(.*?,.*?.*?,).*?.*?,.*?(.*?),.*?.*?,.*?(.*?)', line)
        mo = re.search(r'(.*?),.*?(.*?.*?,\S*?,).*?.*?.*?,.*?(.*?),', line)
        if mo:
            print(mo.group(2))

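3 Answers

User

Answer #1 · edited 2024-04-24 23:13:26

I would use pandas and rearrange the columns as you need, since the data looks like reasonably well-formed CSV. This approach also lets you see how you are manipulating the data while you edit the CSV: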

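import pandas as pd
# Read without a header row; empty cells become '' instead of NaN.
df = pd.read_csv('inputCSV.csv', header=None).fillna('')
df = df.astype(str)
# Reorder to columns 4, 1, 2. With no path given, to_csv returns the CSV text.
# (lineterminator needs pandas 1.5+; older versions use line_terminator.)
out = df[[4, 1, 2]].to_csv(index=False, header=False, encoding='utf-8', lineterminator='\r\n')
print(out)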

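Your question does not make clear what format the data in each column has.

I believe you may be missing commas in your input CSV file. My suggestion is to search for the missing commas and add them, so that the input file is correctly formatted.

Of course, the quickest way is to split the string with .split(), as mentioned in the other answers, but you seem unsure of what you are doing, so I recommend parsing with pandas.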
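One way to look for rows with missing (or extra) commas is to count the fields in each row and report the rows that differ from the most common count. This is a minimal sketch, not part of the original answer: the helper name `field_count_report` is hypothetical, and the sample rows are copied from the question.

```python
import csv
import io
from collections import Counter


def field_count_report(lines):
    """Return (most common field count, [(line_no, count)] for rows that differ)."""
    rows = list(csv.reader(lines))
    # Count how many rows have each number of fields, ignoring blank lines.
    counts = Counter(len(row) for row in rows if row)
    expected = counts.most_common(1)[0][0]
    odd = [
        (line_no, len(row))
        for line_no, row in enumerate(rows, start=1)
        if row and len(row) != expected
    ]
    return expected, odd


# Sample rows taken from the question (the backslash in \par is escaped).
sample = io.StringIO(
    ", 0x40a846, mov [ecx+2bh],al, 88 41 2B, , , , \\par\n"
    ", 0x40a849, jmp $+001775cbh (0x581e14), E9 C6 75 17 00, , , , \\par\n"
    ", 0x40a84e, int3, CC, , , , \\par\n"
    ", 0x40a84f, int3, CC, , , , \\par\n"
)

expected, odd = field_count_report(sample)
print("most common field count:", expected)
print("rows with a different count:", odd)
```

On these sample rows the first line is flagged, because the comma inside the operand `[ecx+2bh],al` adds a field. Any fixed column index therefore points at different data on that row than on the others.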

User

Answer #2 · edited 2024-04-24 23:13:26

You can use the csv module, which you have already imported but are not currently using.

import csv

file_path = 'test.csv'

# newline='' lets the csv module handle line endings itself.
with open(file_path, newline='') as csvfile, open('tempfile.csv', 'w', newline='') as outfile:
    reader = csv.reader(csvfile)
    writer = csv.writer(outfile, delimiter=',')
    for row in reader:
        new_row = [e.strip() for e in row if len(e.strip()) > 0]
        if not new_row:
            continue  # skip lines that have no non-empty fields
        # The new row should have the first element, then the last,
        # followed by everything else that wasn't empty.
        new_row = [new_row[0], new_row[-1]] + new_row[1:-1]
        writer.writerow(new_row)

The new CSV file looks like this:

^{pr2}$

User

Answer #3 · edited 2024-04-24 23:13:26

As others have suggested, you can split each line at the commas and then add the commas back when printing:

file_to_change = 'testingthecodexlconverter.csv'

with open(file_to_change) as f:
    for line in f:
        # Drop the leading ", " and the last 12 characters
        # (trailing empty fields, "\par" and the newline).
        line = line[2:-12]

        tokens = line.split(',')

        # If column index 3 is empty, print without it to avoid
        # unnecessary space.
        if not tokens[3]:
            print(tokens[0] + ", " + tokens[2].strip(" ") + ", " + tokens[1] + ",,,")
        else:
            print(tokens[0] + "," + tokens[3] + ", " + tokens[2].strip(" ") + ", " + tokens[1] + ",,,")

This prints in the following format:

^{pr2}$
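Since the question asks about regular expressions, the same kind of rearrangement can also be sketched with `re` and named groups. This is only an illustration under stated assumptions: the pattern `LINE_RE`, the helper `rearrange`, and the output order (address, bytes, instruction) are hypothetical choices, and the pattern assumes that the machine-code bytes are uppercase hex pairs, as in the sample rows.

```python
import re

# Assumed layout: ", <address>, <instruction>, <hex bytes>, , , , \par"
# The instruction group is greedy on purpose: an operand may contain a
# comma (e.g. "mov [ecx+2bh],al"), so the regex backtracks from the end of
# the line to the last field that looks like uppercase hex byte pairs.
LINE_RE = re.compile(
    r"^,\s*(?P<addr>0x[0-9a-fA-F]+),\s*"
    r"(?P<instr>.+),\s*"
    r"(?P<hexbytes>[0-9A-F]{2}(?: [0-9A-F]{2})*)\s*,"
)


def rearrange(line):
    """Return 'address, bytes, instruction', or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return ", ".join([m.group("addr"), m.group("hexbytes"), m.group("instr")])


# Sample rows taken from the question (the backslash in \par is escaped).
sample_lines = [
    ", 0x40a846, mov [ecx+2bh],al, 88 41 2B, , , , \\par",
    ", 0x40a849, jmp $+001775cbh (0x581e14), E9 C6 75 17 00, , , , \\par",
    ", 0x40a84e, int3, CC, , , , \\par",
]

for raw in sample_lines:
    print(rearrange(raw))
```

Unlike slicing with fixed offsets or indexing split tokens, this keeps an operand such as `[ecx+2bh],al` together as part of the instruction, at the cost of relying on the bytes field being recognizable by its format.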
