Need to parse a .txt file into a DataFrame using Python 2.7

Incident Number: PD160010001 Incident Type: SUSPICIOUS PERSON(S) EMS Blk: 186605 Fire Blk: 65005 Police Blk: 22145 Location: Location name,22 at XXXX Name RD ,22 Entered: 01/01/16 00:00 Dispatched: 01/01/16 00:00 Enroute: 01/01/16 00:00 On Scene: 01/01/16 00:00 Transport: / / : Trans Complete: / / : Closed: 01/01/16 00:04 01/01/16 00:00 OUTSRV 01/01/16 00:00 DISPOS 22H4 01/01/16 00:00 PREMPT 22H4 01/01/16 00:00 DISPOS 2212 01/01/16 00:00 EXCH 22H4 01/01/16 00:01 ADDER 22H4 01/01/16 00:04 CLEAR 2212 01/01/16 00:04 CLEAR 22H4 01/01/16 00:04 CLOSE 22H4

1 answer

User

#1 · Posted on 2024-05-23 18:15:40

Your description of the data layout is ambiguous, so I made some assumptions. I am guessing the .txt file looks like this:

          header2  header3  header4  header5  header6  header7  header8  header9
index 1   data12   data13   data14   data15   data16   data17   data18   data19
index 2   data22   data23   data24   data25   data26   data27   data28   data29

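Here each index corresponds to one call, and each column corresponds to one attribute of that call, with the header describing what the data in that column represents.

The following program converts the .txt file above into a DataFrame and prints it.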

import re
import pandas as pd

filename = 'data.txt'  # assumed path; point this at your own file

with open(filename) as f:
    rows = [line.strip() for line in f if line.strip()]  # trim each line and skip blank ones

columns = rows[0]  # the top row holds the column headers
columns = re.sub(r' {2,}', ',', columns)  # replace runs of two or more spaces with commas
columns = columns.split(',')  # turn the header row into a list
content = rows[1:]  # all but the first row
content = [re.sub(r' {2,}', ',', row) for row in content]  # again, whitespace to commas
content = [row.split(',') for row in content]  # turn rows into lists
index = [row[0] for row in content]  # take the first element of each row as the index
content = [row[1:] for row in content]  # remove the index from the content
df = pd.DataFrame(data=content, index=index, columns=columns)  # combine into a DataFrame
print(df)

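Here we assume that columns are separated by at least two spaces and that the data itself never contains a double space. The data must not contain commas either, because commas serve as the temporary delimiter. If the gap between columns is always wider than that, you can change the regex to look for three or more consecutive spaces.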
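As a rough sketch of adjusting the separator width, the helper below splits a line directly with `re.split` instead of substituting commas first. The names `split_fields` and `min_spaces` are hypothetical and not part of the program above. Splitting directly also means that commas inside a field, such as `Location name,22` in the question's sample, would survive intact.

```python
import re


def split_fields(line, min_spaces=2):
    """Split a line on runs of at least `min_spaces` consecutive spaces."""
    pattern = r' {%d,}' % min_spaces  # e.g. ' {3,}' when min_spaces is 3
    return re.split(pattern, line.strip())


# With the default, two or more spaces separate the fields.
fields = split_fields('index 1   data12   data13')

# With min_spaces=3, a double space inside a field is left alone.
wide_fields = split_fields('a  b   c', min_spaces=3)
```

Raising `min_spaces` only helps if every column gap in the file is at least that wide, so it is worth checking a few rows by eye first.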

The output is

        header2 header3 header4 header5 header6 header7 header8 header9
index 1  data12  data13  data14  data15  data16  data17  data18  data19
index 2  data22  data23  data24  data25  data26  data27  data28  data29

But because it is a DataFrame, you can do much more than just print it.
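For instance, here is a minimal sketch of a few lookups, assuming a small DataFrame shaped like the one printed above. The two-column data is a made-up placeholder built inline so the snippet runs on its own.

```python
import pandas as pd

# Placeholder data mirroring the layout above (assumed, not from a real file).
columns = ['header2', 'header3']
index = ['index 1', 'index 2']
content = [['data12', 'data13'], ['data22', 'data23']]
df = pd.DataFrame(data=content, index=index, columns=columns)

cell = df.loc['index 1', 'header3']          # one value, addressed by row label and column name
column_values = df['header2'].tolist()       # a whole column as a plain list
matches = df[df['header3'] == 'data23']      # only the rows whose header3 equals 'data23'
```

Note that every value parsed from the text file is a string, so date or numeric columns would need converting before you sort or do arithmetic on them.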
